A web-based healthcare data analysis assessment platform for evaluating analytical thinking and problem-solving approaches. It features a secure candidate invitation system, comprehensive activity tracking, and professional evaluation scenarios based on real-world healthcare data integrity challenges.
Perfect for:
- Healthcare Data Analysis Assessment - Evaluate analytical thinking with real-world data integrity scenarios
- Professional Candidate Evaluation - Secure invitation-based assessment system with comprehensive tracking
- Data Analysis Interview Process - Focus on problem-solving approach rather than SQL proficiency
- Business Intelligence Evaluation - Test data validation, reconciliation, and pattern analysis skills
- Healthcare Data Consulting - Assess experience with accounts receivable, claims processing, and payor analysis
- Unique URL Generation - Create secure, time-limited candidate assessment links
- Email-Based Invitations - Professional invitation management with expiration tracking
- Usage Monitoring - Track when links are accessed and prevent unauthorized sharing
- Admin Impersonation - Admin can view assessment as candidates with full audit trail
- Access Control - No public access; candidates reach the assessment only via secure invitation URLs
- Complete Query Logging - Every SQL query captured including syntax errors and performance metrics
- Think Time Analysis - Calculate time between candidate activities to measure analysis approach
- Tab Switching Detection - Track when candidates navigate away (potential external consultation)
- Page Visibility Monitoring - Detect window minimization and tab switching, with return-duration tracking
- Session Activity - Login events, page views, impersonation activities with IP and user agent
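The think time measurement listed above can be sketched as the gap between consecutive activity timestamps. This is a minimal illustration, not the platform's implementation: the function name `think_times` and the sample events are hypothetical, and it assumes every activity is stored as a timezone-aware UTC timestamp.

```python
from datetime import datetime, timezone

def think_times(timestamps):
    """Return the gaps, in seconds, between consecutive activity timestamps."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]

# Hypothetical activity log for one candidate, all timestamps in UTC.
events = [
    datetime(2024, 1, 15, 14, 0, 0, tzinfo=timezone.utc),   # query executed
    datetime(2024, 1, 15, 14, 2, 30, tzinfo=timezone.utc),  # next query
    datetime(2024, 1, 15, 14, 2, 45, tzinfo=timezone.utc),  # quick retry
]
print(think_times(events))  # [150.0, 15.0]
```

A long gap followed by a short one, as in this sample, is the kind of pattern a reviewer might read as "thought about the problem, then fixed a small error".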
- Data Integrity Challenges - Real-world healthcare data validation scenarios
- Account Balance Validation - Test business rule: Balance = Charges - Payments - Adjustments
- Cross-Table Reconciliation - Validate rollups between accounts and transactions tables
- Payment Analysis - Insurance vs patient payment patterns and reconciliation
- AR Status Distribution - Analyze accounts receivable patterns by service date and payor
- Transaction Crosswalk Integrity - Comprehensive data quality audits
- Candidate Invitation Management - Create, track, and manage candidate assessment URLs
- Activity Timeline View - Complete chronological view of candidate behavior with think times
- Query History Analysis - Review all queries including failed attempts and error patterns
- Performance Analytics - Success rates, execution times, and analytical approach insights
- Evaluation Reports - Export comprehensive candidate assessments for interview review
- Professional Assessment Interface - Clean, distraction-free evaluation environment
- Embedded Schema Reference - Healthcare data structure and business rules built into challenges
- SQLite Constraint Documentation - Clear notes about CTE limitations and date function differences
- Activity Visibility Indicators - Visual cues for page visibility changes and think time patterns
- Mobile-Responsive Design - Seamless experience across devices for flexible assessment locations
- Impersonation Mode - Admin can experience assessment exactly as candidates do
- Server-side pagination - Navigate through millions of rows efficiently
- Smart pagination rules - Respects user LIMIT clauses (≤5000 rows) exactly
- Configurable page sizes - Choose 100, 250, 500, or 1000 rows per page
- Performance optimization - Prevents browser freezing with large datasets
- User preferences - Persistent settings for font size and page size
- Font size controls - 5 levels from extra small to extra large for optimal viewing
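One way to read the pagination rules above: a query whose own LIMIT is at most 5000 runs unchanged, and anything else is wrapped in a page-sized LIMIT/OFFSET. The sketch below assumes that reading. `plan_query` and the regex are hypothetical, and the pattern only recognizes a plain trailing `LIMIT n`.

```python
import re

MAX_DIRECT_LIMIT = 5000                  # threshold quoted in the feature list
ALLOWED_PAGE_SIZES = (100, 250, 500, 1000)

# Matches a trailing "LIMIT n" (optionally followed by a semicolon).
_LIMIT_RE = re.compile(r"\blimit\s+(\d+)\s*;?\s*$", re.IGNORECASE)

def plan_query(sql, page=1, page_size=100):
    """Return the SQL to run for one page of results."""
    if page_size not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"page_size must be one of {ALLOWED_PAGE_SIZES}")
    match = _LIMIT_RE.search(sql)
    if match and int(match.group(1)) <= MAX_DIRECT_LIMIT:
        return sql  # honor the user's LIMIT exactly as written
    inner = sql.strip().rstrip(";")
    offset = (page - 1) * page_size
    return f"SELECT * FROM ({inner}) LIMIT {page_size} OFFSET {offset}"

print(plan_query("SELECT * FROM hw_accounts LIMIT 50"))
print(plan_query("SELECT * FROM hw_accounts", page=3, page_size=250))
```

Wrapping the user's query as a subquery keeps its semantics intact while letting the server fetch only one page at a time.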
pip3 install -r requirements.txt
python3 app.py
The application will be available at: http://localhost:5002
- Navigate to Admin Login and enter your authorized email
- Go to Admin Dashboard → "Candidate Invitations"
- Create secure assessment URLs for candidates
- Use "Reseed Evaluation Challenges" to load VisiQuate healthcare scenarios
- Send Invitation: Provide unique URL to candidate
- Monitor Progress: Watch real-time activity in admin dashboard
- Review Analysis: Examine query history, think times, and approach
- Export Report: Generate comprehensive assessment for interview discussion
- Free-form SQL practice with any uploaded data
- Intelligent sample queries based on your schema
- Real-time query execution with results visualization and progress tracking
- Schema browser for table exploration with sample data preview
- Smart pagination - Navigate through large result sets efficiently
- Configurable display - Adjustable font sizes and rows per page (100-1000)
- SQL semantics respect - LIMIT clauses honored exactly as written
- Query performance monitoring - Execution time tracking with 60-second timeout
- 10 healthcare data integrity scenarios - VisiQuate evaluation challenges with professional formatting
- Data validation focus - Account balance, payment reconciliation, crosswalk integrity
- Approach evaluation - Analytical thinking assessment rather than SQL proficiency testing
- Complete activity tracking - Every query, error, and navigation event captured
- Think time measurement - Time between activities shows analytical process
- UTC timezone handling - All timestamps stored in UTC, displayed in user's local time
- Candidate invitation management - Generate secure time-limited assessment URLs
- Real-time activity monitoring - Watch candidate progress with live activity feeds
- Comprehensive analytics - Query success rates, think times, tab switching patterns
- Detailed candidate reports - Complete assessment timeline with query history
- Admin impersonation - Experience assessment exactly as candidates do
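The secure, time-limited invitation URLs mentioned above could be backed by a random token plus an expiry timestamp. The following is a minimal sketch under that assumption; `create_invitation`, `is_usable`, the record fields, and the 7-day default are all hypothetical rather than taken from `models/candidates.py`.

```python
import secrets
from datetime import datetime, timedelta, timezone

def create_invitation(email, valid_days=7, now=None):
    """Build an invitation record with an unguessable token and an expiry."""
    now = now or datetime.now(timezone.utc)
    return {
        "email": email,
        "token": secrets.token_urlsafe(32),   # URL-safe random token
        "expires_at": now + timedelta(days=valid_days),
        "active": True,                       # cleared when an admin deactivates it
    }

def is_usable(invitation, now=None):
    """An invitation works only while it is active and not yet expired."""
    now = now or datetime.now(timezone.utc)
    return invitation["active"] and now < invitation["expires_at"]

invitation = create_invitation("candidate@example.com")
print(is_usable(invitation))  # True for a freshly created invitation
```

Using `secrets` rather than `random` matters here because the token is the only credential a candidate presents.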
data-explorer/
├── app.py                          # Main Flask application (refactored & modular)
├── app_monolithic.py               # Original monolithic version (backup)
├── requirements.txt                # Python dependencies
├── healthcare_quiz.db              # Default sample database
├── user_data.db                    # User tracking and challenges
├── models/                         # Data models and database operations
│   ├── __init__.py                 # Package initialization
│   ├── database.py                 # Database connections and initialization
│   ├── challenges.py               # VisiQuate evaluation scenarios
│   ├── users.py                    # User management and candidate tracking
│   ├── candidates.py               # Invitation system and activity logging
│   └── admin_auth.py               # Admin authentication and authorization
├── routes/                         # Route handlers (future expansion)
│   └── __init__.py                 # Package initialization
├── utils/                          # Utility functions and helpers
│   ├── __init__.py                 # Package initialization
│   ├── data_processing.py          # CSV processing and schema detection
│   ├── query_validation.py         # SQL security and validation
│   └── timezone.py                 # UTC timestamp utilities and browser timezone handling
├── templates/
│   ├── base.html                   # Base layout with activity tracking
│   ├── index.html                  # Candidate landing page (invitation-only)
│   ├── explore.html                # Data explorer interface
│   ├── upload.html                 # Data upload interface (admin-only)
│   ├── challenges.html             # VisiQuate evaluation scenarios
│   └── admin/                      # Admin interface templates
│       ├── dashboard.html          # Admin dashboard with invitation management
│       ├── candidates.html         # Candidate activity tracking
│       ├── candidate_detail.html   # Detailed assessment view
│       └── candidate_invitations.html  # Invitation URL management
├── static/
│   ├── css/style.css               # Custom styles with challenge formatting
│   ├── js/
│   │   ├── app.js                  # JavaScript utilities
│   │   └── timezone.js             # Browser timezone conversion utilities
│   └── ...
└── deploy/                         # Docker deployment configs
- GET /api/schema - Get database schema information
- GET /api/tables - List available tables
- POST /api/execute - Execute SQL queries with comprehensive activity logging
- GET /api/sample-queries - Get sample queries for data exploration
- POST /api/log-activity - Log candidate page visibility and navigation events

- GET /api/challenges - Get healthcare data integrity evaluation scenarios
- GET /api/challenge/<id> - Get specific evaluation scenario details
- POST /api/challenge/<id>/attempt - Submit evaluation scenario attempt
- GET /api/user/progress - Get candidate progress across evaluation scenarios

- GET /api/admin/candidates - Get all candidates with activity summaries
- GET /api/admin/candidate/<username>/detail - Complete candidate activity timeline
- GET /api/admin/analytics - System-wide evaluation performance analytics
- GET /api/admin/export/candidate/<username> - Export comprehensive assessment report

- GET /api/admin/candidates/invitations - List all candidate invitation URLs
- POST /api/admin/candidates/invitations - Create new secure candidate invitation
- POST /api/admin/candidates/invitations/<id>/deactivate - Deactivate invitation URL
- GET /api/admin/candidates/<user_id>/activity - Get detailed candidate activity log

- POST /api/admin/impersonate/<user_id> - Start admin impersonation of candidate
- POST /api/admin/end-impersonation - End active impersonation session
- POST /api/admin/challenges/reseed - Update evaluation scenarios with latest content
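As a rough illustration of calling the query endpoint listed above, the sketch below builds a `POST /api/execute` request with the standard library. The JSON body shape (a single `query` field) is an assumption; the README does not document the payload, and a real call would also need a valid candidate or admin session.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5002"  # default address from the quick start

def build_execute_request(sql):
    """Build (but do not send) a POST request for /api/execute."""
    body = json.dumps({"query": sql}).encode("utf-8")  # assumed field name
    return urllib.request.Request(
        f"{BASE_URL}/api/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_execute_request("SELECT COUNT(*) FROM hw_accounts")
print(req.get_method(), req.full_url)
# To send it against a running instance: urllib.request.urlopen(req)
```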
- Account Balance Validation - Test core business rule: Balance = Charges - Payments - Adjustments
- Claim Date Pattern Analysis - Validate first_claim_bill_date ≤ last_claim_bill_date logic
- Data Overview & Exploration - Understand table structures and record counts
- Account vs Transaction Reconciliation - Validate adjustment rollups between tables
- Payment Reconciliation Analysis - Insurance vs patient payment validation across tables
- Multi-dimensional Validation - Test multiple business rules simultaneously
- AR Status & Balance Distribution - Multi-faceted analysis combining rule validation with BI
- Primary Payor Performance Analysis - Payment ratios, account status patterns, and efficiency metrics
- Temporal Pattern Analysis - Service date trends and payor performance over time
- Transaction Crosswalk Integrity - Comprehensive data quality audit with orphan detection
- Advanced Data Quality - Uniqueness validation, unused code identification, pattern analysis
- Complete System Validation - Full end-to-end data integrity assessment
- Analytical Thinking Focus - Evaluates problem-solving approach, not SQL syntax proficiency
- Business Context Understanding - Tests real-world healthcare data integrity scenarios
- Documentation Emphasis - Candidates document findings as they would for clients
- Methodology Assessment - Process and reasoning more important than perfect queries
-- Test business rule: Balance = Charges - Payments - Adjustments
SELECT invoice_id,
       balance,
       total_charges,
       total_payments,
       total_adjustments,
       (total_charges - total_payments - total_adjustments) AS calculated_balance,
       (balance - (total_charges - total_payments - total_adjustments)) AS variance
FROM hw_accounts
WHERE (balance - (total_charges - total_payments - total_adjustments)) != 0;
-- Validate account-level payments match transaction-level rollups
SELECT a.invoice_id,
       a.ins_payments,
       COALESCE(SUM(t.total_ins_payments), 0) AS trans_ins_payments,
       a.pt_payments,
       COALESCE(SUM(t.total_pt_payments), 0) AS trans_pt_payments
FROM hw_accounts a
LEFT JOIN hw_transactions t ON a.invoice_id = t.invoice_id
GROUP BY a.invoice_id
HAVING a.ins_payments != COALESCE(SUM(t.total_ins_payments), 0)
    OR a.pt_payments != COALESCE(SUM(t.total_pt_payments), 0);
-- Analyze open vs closed account patterns by payor and service month
SELECT cur_payor,
       strftime('%Y-%m', service_start_date) AS service_month,
       COUNT(*) AS total_accounts,
       COUNT(CASE WHEN ar_status = 'Open' THEN 1 END) AS open_accounts,
       ROUND(100.0 * COUNT(CASE WHEN ar_status = 'Open' THEN 1 END) / COUNT(*), 2) AS open_percentage
FROM hw_accounts
WHERE cur_payor IS NOT NULL AND service_start_date IS NOT NULL
GROUP BY cur_payor, service_month
ORDER BY open_percentage DESC;
-- Comprehensive crosswalk integrity analysis with orphan detection
SELECT 'Crosswalk Duplicates' AS issue_type,
       COUNT(*) AS count
FROM (SELECT txn_type_code, txn_sub_type_code, COUNT(*)
      FROM hw_trn_codes
      GROUP BY txn_type_code, txn_sub_type_code
      HAVING COUNT(*) > 1)
UNION
SELECT 'Orphan Transactions' AS issue_type,
       COUNT(DISTINCT t.txn_sub_type_code)
FROM hw_transactions t
LEFT JOIN hw_trn_codes c ON t.txn_sub_type_code = c.txn_sub_type_code
WHERE c.txn_sub_type_code IS NULL;
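To try the balance rule from the first query without the sample database, the sketch below runs it against an in-memory SQLite table. The column types and the two sample rows are assumptions made for illustration. The comparison uses a small tolerance instead of `!= 0`, a deliberate swap that avoids false positives from floating-point rounding if the amounts are stored as REAL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal shape of hw_accounts; the real table has more columns.
conn.execute("""CREATE TABLE hw_accounts (
    invoice_id TEXT PRIMARY KEY, balance REAL, total_charges REAL,
    total_payments REAL, total_adjustments REAL)""")
conn.executemany(
    "INSERT INTO hw_accounts VALUES (?, ?, ?, ?, ?)",
    [("INV-1", 50.0, 100.0, 40.0, 10.0),   # consistent: 100 - 40 - 10 = 50
     ("INV-2", 75.0, 100.0, 40.0, 10.0)],  # balance overstated by 25
)

# Rule under test: Balance = Charges - Payments - Adjustments
violations = conn.execute("""
    SELECT invoice_id,
           ROUND(balance - (total_charges - total_payments - total_adjustments), 2) AS variance
    FROM hw_accounts
    WHERE ABS(balance - (total_charges - total_payments - total_adjustments)) > 0.005
""").fetchall()
print(violations)  # [('INV-2', 25.0)]
```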
- Overall completion rates and progress visualization
- Score breakdowns by difficulty level and category
- Time-to-completion analysis across challenges
- Hint usage patterns and help-seeking behavior
- Query evolution and problem-solving approaches
- Challenge difficulty rankings based on success rates
- Performance trends across candidate pool
- Most challenging problems identification
- Average execution times and optimization opportunities
- Candidate activity patterns and engagement metrics
- Individual Reports: Complete candidate assessment with query history
- Comparative Analysis: Performance relative to candidate pool
- Skill Mapping: Strengths and weaknesses by SQL concept
- Progression Tracking: Improvement over time and attempts
- Export Formats: JSON reports for external analysis
- UTC Timezone Handling - All timestamps stored in UTC, displayed in user's browser timezone
- Enhanced Challenge Formatting - Professional HTML presentation with clear section labels
- Admin Session Improvements - Fixed 500 errors, enhanced impersonation and candidate detail access
- Browser Timezone Display - Automatic conversion of UTC times to local timezone preferences
- Professional Assessment Layout - Clear separation of Data Set, Scenario, Rule/Constraint, and Task sections
- Secure Candidate Invitation System - Time-limited unique URLs with usage tracking and admin impersonation
- Comprehensive Activity Tracking - Every query, error, navigation event with think time analysis
- Tab Switching Detection - Monitor when candidates navigate away (potential external consultation)
- VisiQuate Healthcare Scenarios - 10 real-world data integrity evaluation challenges
- Professional Assessment Interface - Focus on analytical approach over SQL proficiency
- Admin Activity Dashboard - Complete candidate timeline with query history and performance analytics
- High-performance CSV processing - Optimized for 150K+ row datasets with column type caching
- Resilient authentication system - Graceful degradation when user database unavailable
- Query validation improvements - Proper support for SQL comments in SELECT statements
- Duplicate column handling - Automatic renaming of duplicate CSV column headers
- Database schema migrations - Robust handling of existing database upgrades
- Production deployment fixes - Resolved container permissions and initialization issues
- UI visibility enhancements - Fixed dark theme code examples and error templates
- Smart server-side pagination - Navigate through millions of rows efficiently
- SQL semantics compliance - Respects user LIMIT clauses (≤5000) exactly as written
- Configurable page sizes - Choose 100, 250, 500, or 1000 rows per page with persistent preferences
- Font size controls - 5 adjustable levels (XS to XL) for optimal data viewing
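The duplicate column handling noted in the updates above, together with the UTF-8 BOM cleaning the application performs, might look roughly like this. `clean_headers` and the `_2`, `_3` suffix scheme are hypothetical; the README does not say which renaming convention `utils/data_processing.py` uses.

```python
import csv
import io

def clean_headers(headers):
    """Strip a leading BOM and whitespace, then rename repeated column names."""
    seen = {}
    cleaned = []
    for raw in headers:
        name = raw.lstrip("\ufeff").strip()
        count = seen.get(name, 0)
        seen[name] = count + 1
        # First occurrence keeps its name; later ones get _2, _3, ...
        cleaned.append(name if count == 0 else f"{name}_{count + 1}")
    return cleaned

# Hypothetical CSV with a BOM and a repeated "amount" column.
sample = "\ufeffinvoice_id,amount,amount\nINV-1,10,12\n"
reader = csv.reader(io.StringIO(sample))
print(clean_headers(next(reader)))  # ['invoice_id', 'amount', 'amount_2']
```

This sketch does not guard against a file that already contains a column literally named `amount_2`; a production version would need to check generated names against the existing ones.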
# Start development environment
make dev
# Run tests
make test
# View logs
make logs
# Access container shell
make shell
# Automated deployment via GitHub Actions
git push origin main
# Manual deployment
docker compose up -d
# Check status
docker compose ps
- Database Engine: SQLite for fast, embedded operations
- Query Performance: Sub-second execution for most operations with 60-second timeout
- Smart Pagination: Server-side pagination for large result sets (up to millions of rows)
- File Upload: Handles large CSV files (150K+ rows) with streaming processing and column type caching
- Browser Optimization: Prevents freezing with configurable page sizes (100-1000 rows)
- Concurrent Users: Optimized for interview scenarios with efficient resource management
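The README states a 60-second query timeout but not how it is enforced. One possible mechanism, sketched here as an assumption, is SQLite's progress handler: `sqlite3` calls it periodically during execution and aborts the statement when it returns a non-zero value. `execute_with_timeout` is a hypothetical helper name.

```python
import sqlite3
import time

def execute_with_timeout(conn, sql, timeout_seconds=60.0):
    """Run a query, aborting it once the deadline has passed."""
    deadline = time.monotonic() + timeout_seconds
    # The handler runs every 1000 SQLite VM instructions; a non-zero
    # return value makes SQLite abort the running statement.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 1000)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler again

conn = sqlite3.connect(":memory:")
print(execute_with_timeout(conn, "SELECT 1"))  # [(1,)]
```

An aborted query surfaces as `sqlite3.OperationalError`, which the caller can catch and report to the candidate as a timeout.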
- Read-only database access for candidate queries
- Query validation blocks dangerous SQL operations
- Input sanitization prevents SQL injection attacks
- UTF-8 BOM cleaning prevents hidden character issues
- Container security with non-root user execution
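The first two protections above can be combined as defense in depth: a lightweight check that accepts only SELECT-style statements (ignoring SQL comments), backed by a connection opened in read-only mode. This is a simplified sketch, not the logic in `utils/query_validation.py`; both function names are hypothetical and the comment-stripping regex does not account for comment markers inside string literals.

```python
import re
import sqlite3

# Line comments (-- ...) and block comments (/* ... */).
_COMMENTS = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)

def is_read_only_query(sql):
    """Accept a single statement that starts with SELECT or WITH."""
    stripped = _COMMENTS.sub(" ", sql).strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input or multiple statements
    first_word = stripped.split(None, 1)[0].upper()
    return first_word in {"SELECT", "WITH"}

def open_read_only(path):
    """Open a SQLite file so that any write attempt fails."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

print(is_read_only_query("-- row count\nSELECT COUNT(*) FROM hw_accounts;"))  # True
print(is_read_only_query("DROP TABLE hw_accounts"))                           # False
```

The keyword check alone is not sufficient (SQLite allows `WITH ... DELETE`), which is why the read-only connection is the real enforcement point in this sketch.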
- UTC Storage: All timestamps stored consistently in UTC using utils/timezone.py
- Browser Display: Automatic conversion to user's local timezone via static/js/timezone.js
- Activity Tracking: Precise timing with timezone-aware calculations
- Professional Timestamps: Clear date/time formatting with timezone indicators
- Think Time Accuracy: UTC-based calculations ensure accurate time measurements across timezones
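The store-in-UTC, display-in-local pattern described above can be illustrated server-side as follows. In the application the display conversion happens in the browser via `static/js/timezone.js`; the helper names here are hypothetical, and a fixed UTC-5 offset stands in for the viewer's timezone.

```python
from datetime import datetime, timedelta, timezone

def utc_now_iso():
    """Timestamp to store: timezone-aware UTC in ISO 8601 format."""
    return datetime.now(timezone.utc).isoformat()

def to_local(stored, tz):
    """Convert a stored UTC ISO string to the viewer's timezone."""
    return datetime.fromisoformat(stored).astimezone(tz)

stored = "2024-01-15T14:00:00+00:00"
viewer_tz = timezone(timedelta(hours=-5))  # fixed offset, for illustration only
print(to_local(stored, viewer_tz).isoformat())  # 2024-01-15T09:00:00-05:00
```

Because only the display step depends on the viewer, differences between stored timestamps (such as think times) are unaffected by where the candidate or the reviewer is located.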
- Modern Browsers: Chrome, Firefox, Safari, Edge (latest versions)
- Mobile Support: Touch-friendly responsive design
- Accessibility: WCAG 2.1 AA compliance
- Progressive Web App: Offline capability and installable
- Admin Setup: Log in with one of the three authorized admin email addresses
- All admins have full access to impersonation and candidate detail functions
- Enhanced error handling prevents 500 errors during admin operations
- Create Invitation: Generate secure candidate URL with expiration date via Admin Dashboard → Candidate Invitations
- Send to Candidate: Provide unique assessment URL (expires automatically to prevent sharing)
- Real-time Monitoring: Watch candidate progress, query attempts, and think times through admin dashboard
- Review Analysis: Examine complete activity timeline including tab switching and analytical approach
- Export Report: Generate comprehensive assessment for interview discussion
- Access Assessment: Use provided unique URL (no registration required)
- Understand Context: This evaluates analytical thinking, not SQL proficiency
- Document Approach: Save queries and document findings for interview discussion
- Focus on Method: Emphasis on problem-solving process and business insights
- Use Resources: Schema reference and business rules provided within assessment
- Data Integrity Validation: Testing business rules and identifying violations
- Cross-Table Reconciliation: Validating rollups between related tables
- Pattern Recognition: Identifying trends in AR status, payor performance, and date patterns
- Data Quality Assessment: Finding orphan records, duplicates, and data inconsistencies
- Business Analysis: Understanding healthcare finance concepts (AR, claims, payments)
- Analytical Documentation: Summarizing findings as you would for clients
- User-uploaded data stays local to your deployment
- No external data transmission except for application functionality
- SQLite local storage with configurable retention policies
- Assessment data tracking with anonymization options
- Container isolation with minimal attack surface
- Read-only query execution prevents data modification
- Input validation at multiple application layers
- Secure deployment with Cloudflare tunnel integration
We welcome contributions! Please see CONTRIBUTING.md for detailed guidelines on:
- Development setup and coding standards
- Pull request process and review guidelines
- Testing requirements and security considerations
- Feature development areas and technical improvements
- Bug reporting and performance optimization
Key areas for enhancement:
- Challenge Library: Add more domain-specific problems
- UI/UX Improvements: Enhanced candidate experience
- Analytics Features: Advanced performance insights
- Integration Capabilities: HR system integrations
- Security Enhancements: Additional query validation
- Performance Optimization: Query execution improvements
- Module Refactoring: Break app.py into focused modules ✅
- Custom Challenge Creation: Admin interface for creating new challenges
- Team Assessment: Multi-candidate comparison tools
- API Integrations: Connect with ATS/HR systems
- Advanced Analytics: Machine learning insights
- Mobile App: Native mobile assessment experience
- UTC Timezone System: All timestamps stored in UTC with browser-local display ✅
- Enhanced Admin Access: Fixed 500 errors for impersonation and candidate details ✅
- Challenge Formatting: Professional HTML presentation with clear section labels ✅
- Database Optimization: CSV processing performance enhancements ✅
- Authentication System: Resilient authentication with graceful degradation ✅
- Schema Migration: Robust database schema updates ✅
- Query Validation: Enhanced security with comment support ✅
- Caching Layer: Redis integration for better performance
- Role-based Access Control: Admin/candidate permission levels
- Audit Logging: Enhanced activity tracking
- Backup Systems: Automated data protection
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Professional SQL skills assessment made simple
Transform any CSV data into interactive SQL assessment experiences