This case study details the design and implementation of a high-throughput dermatology image annotation platform that significantly increased annotation efficiency. The system enables dermatologists to classify skin conditions and provide rich-text descriptions through a streamlined workflow with quality control measures. As the architect and technical lead, I designed the complete system architecture and directed the development team while personally implementing the database layer using Flyway for migration management. The platform demonstrates my ability to lead full-stack projects that solve complex domain-specific problems while prioritizing user experience and system efficiency.
Dermatological image analysis requires expert annotation for both clinical decision support and machine learning model development. Existing annotation tools lacked domain-specific workflows and efficiency features needed by dermatologists. This project addressed the need for a specialized platform that could support both classification of skin conditions and detailed descriptive annotations while maintaining high throughput and quality standards.
- Generic annotation tools lacked dermatology-specific classification taxonomies
- Previous workflows caused bottlenecks in the annotation review process
- Expert dermatologist time was being used inefficiently
- Quality control mechanisms were insufficient for medical applications
- Frontend: React with TypeScript for type safety and component reusability
- Backend: Python FastAPI service for annotation logic and workflow management
- Database: PostgreSQL with JSON support for flexible annotation schemas
- Migration: Flyway for version-controlled database schema evolution
- Authentication: OAuth 2.0 with role-based access control
- Image Processing: Integrated image processing pipeline for preprocessing and standardization
- Annotation engine with dermatology-specific classification taxonomy
- Multi-stage review workflow with quality assurance gates
- Admin configuration tools for taxonomy and workflow adjustment
- Analytics dashboard for throughput and quality metrics
The annotation platform implements a structured workflow optimized for dermatology image annotation with built-in quality control and AI-assisted features:
stateDiagram-v2
[*] --> AdminWorkflow
[*] --> AnnotationWorkflow
state AdminWorkflow {
[*] --> UploadImages
UploadImages --> ConfigureRules: define class categories
ConfigureRules --> SetLimits: control work item creation
SetLimits --> SetPriority: prioritize critical cases
SetPriority --> CreateWorkItems: generate annotation tasks
CreateWorkItems --> [*]: items in 'new' status
}
state AnnotationWorkflow {
[*] --> FetchAndLock: annotator login
FetchAndLock --> ProcessImage: assign to annotator
ProcessImage --> AIAssistance: run through AI pipeline
state AIAssistance {
[*] --> ImageCropping: external API
ImageCropping --> ClassPrediction: custom Sagemaker model
ClassPrediction --> LLMRecommendation: provider next steps
LLMRecommendation --> [*]: display suggestions
}
AIAssistance --> AnnotatorEditing: review & modify suggestions
state AnnotatorEditing {
[*] --> EditAnnotation
EditAnnotation --> SaveChanges
EditAnnotation --> RejectImage: flag problematic image
SaveChanges --> [*]: to 'review' status
RejectImage --> [*]: mark for removal
}
AnnotatorEditing --> QualityReview: submit for QC
state QualityReview {
[*] --> ReviewerDecision
ReviewerDecision --> Approve: accept annotation
ReviewerDecision --> RequestChanges: reject annotation
Approve --> [*]: set 'done' status
RequestChanges --> [*]: set 'edits_requested'
}
QualityReview --> FetchAndLock: if edits requested (same annotator)
QualityReview --> [*]: if approved
}
-
Administrative Preparation
- Admin uploads dermatology images via dashboard, organizing by classification
- Creates work rules with limit settings to control annotation volume (cost management)
- Implements priority structure to ensure critical cases are handled first
- System generates work items in 'new' status based on these rules
-
Annotation Assignment
- Dermatologist logs in and system locks available items (status → 'in_progress')
- System associates the work item with the specific annotator
-
AI-Assisted Preprocessing
- Image passes through production AI pipeline:
- External API for image cropping/standardization
- Custom-trained Sagemaker model for preliminary classification
- LLM API integration for generating "provider recommended next steps"
- Results stored in ai_suggestion table and presented to annotator as starting point
- Image passes through production AI pipeline:
-
Annotation Process
- Annotator reviews and edits AI-generated suggestions as needed
- Changes saved to annotation_revision table with proper versioning
- Option to reject problematic images (records in work_item_flag table)
- Upon completion, status changes to 'review'
-
Quality Control Review
- QC specialist reviews completed annotations
- Can either approve (status → 'done') or request revisions (status → 'edits_requested')
- If revisions needed, work item returns only to original annotator
- Each revision stored with sequential revision_seq tracking
-
Continuous Improvement
- Flagged images processed by backend cleaner for dataset quality maintenance
- Annotation quality metrics tracked for annotator performance evaluation
- Classification System: Hierarchical taxonomy of dermatological conditions with searchable interface
- Rich-Text Annotation: Structured descriptors for lesion characteristics (color, texture, borders, etc.)
- Region Marking: Optional highlighting of regions of interest with measurement tools
- Reference Library: Integrated access to reference images and classification guidelines
- Role-Based Access: Distinct permissions for primary annotators, reviewers, and administrators
- Work Distribution: Intelligent task assignment based on annotator expertise and workload
- Performance Tracking: Individual and team productivity metrics with quality indicators
- Multi-Stage Review: Two-tier review process with expert validation
- Consistency Checks: Automated detection of annotation inconsistencies
- Auto-Assistance: ML-powered suggestions to accelerate annotation (pre-classification)
- Annotation Standards: Enforced structured input for key classification attributes
- Taxonomy Management: Interface for updating classification schemas
- Workflow Configuration: Customizable rules for task assignment and review paths
- Quality Reporting: Comprehensive dashboards for throughput and accuracy
- Export Capabilities: Structured data export for research and model training
The database architecture was personally designed and implemented with a focus on flexibility, performance, and data integrity. PostgreSQL was selected for its robust JSON capabilities while maintaining relational integrity.
- Hybrid Schema: Core entities in normalized tables with flexible annotation content in JSON
- Optimized Queries: Indexed JSON paths for efficient filtering and searching
- Audit Trail: Comprehensive change tracking with temporal queries
- Trigger Functions: Automate state changes according to dataflow within the schema itself
- Version Control: All schema changes managed through Flyway migrations
The API layer was architected using RESTful principles with a domain-driven approach, clearly separating core domain logic, workflow management, and administrative operations. The design emphasized modularity and proper separation of concerns while ensuring robust authentication and authorization throughout.
-
Resource-Oriented Endpoints
/annotations/*- Core annotation operations (creation, updates, search)/tasks/*- Task management and workflow state transitions/review/*- Quality control and approval processes/admin/*- Administrative configuration and monitoring
-
Workflow-Specific Endpoints
- Task assignment and locking
- Review queue management
- State transitions with proper validation
- Heartbeat mechanisms to maintain task locks
-
Administrative Endpoints
- Rule-based task prioritization configuration
- Queue management and monitoring
- User task management and reassignment
- Performance and quality metrics
// Example route handler showcasing key architectural patterns
router.post('/tasks/:task_id/submit', authenticate, async (req, res) => {
const client = await pool.connect();
try {
// Transaction management
await client.query('BEGIN');
// Authorization and validation
const userId = req.user.id;
const { task_id } = req.params;
const { annotation_data } = req.body;
// Row-level locking prevents race conditions
const task = await validateTaskOwnership(client, task_id, userId);
// Domain logic execution with proper state transition
const result = await taskService.submitForReview(client, {
taskId: task_id,
userId,
annotationData: annotation_data
});
// Audit logging for compliance
await auditLogger.recordAction(client, {
userId,
action: 'TASK_SUBMITTED',
resourceId: task_id,
metadata: { previousStatus: task.status }
});
await client.query('COMMIT');
// Event emission for asynchronous processing
eventEmitter.emit('task.submitted', { task_id, userId });
return res.json({
success: true,
data: result
});
} catch (err) {
await client.query('ROLLBACK');
return errorHandler(res, err);
} finally {
client.release();
}
});- JWT-based Authentication: Secure access control with role-based permissions
- Transactional Integrity: Database transactions for multi-step operations
- Optimistic Concurrency: Row-level locking for task assignment
- Event-Driven Architecture: Events emitted for asynchronous processing
- Comprehensive Error Handling: Structured error responses with appropriate status codes
- Audit Logging: All significant actions logged for compliance and debugging
- Needs efficient interface to maximize throughput
- Requires specialized terminology and classification options
- Values consistency and quick access to reference materials
- Focuses on quality control and consistency
- Needs clear comparison views between annotations and source images
- Requires efficient feedback mechanisms
- Manages workflow rules and user assignments
- Monitors system performance and annotation quality
- Configures classification taxonomies and standards
-
Primary Annotation Flow
- Login and dashboard showing assigned tasks and performance metrics
- Task selection or automatic assignment based on rules
- Image viewing with zoom and pan capabilities
- Classification selection through hierarchical menu
- Structured descriptor input with autocomplete
- Submission for review with optional notes
-
Review Process
- Queue of submitted annotations requiring review
- Side-by-side comparison of image and annotation
- One-click approval or detailed revision requests
- Quality metrics tracking for annotators
-
Administrative Workflow
- Dashboard with system-wide metrics
- User management and task assignment rules
- Taxonomy editing and version control
- Batch operations for data export and analysis
- Privacy by Design: System designed with no patient PII exposure within the application
- Authentication: Google OAuth implementation for secure user identity verification
- Authorization: Role-based access control for annotators, reviewers, and administrators
- Audit Trail: Comprehensive activity logging of all annotation actions with timestamps and user attribution
- Medical Data Handling: Ensured compliance with medical data regulations through proper separation of identifiers
- User Accountability: Each annotation action traced to specific users for quality assurance
- Versioning: Complete revision history maintained for all annotations, preserving the full edit history
- Materialized Views: Implemented for commonly accessed data combinations to reduce join overhead
- Strategic Indexing: Table indices created for frequent query patterns across annotation workflows
- Query Tuning: Optimized key workflow queries for annotation assignment and review processes
- Lightweight Design: System architecture balanced for the specific user load requirements
- Resource Allocation: Appropriate database configuration for the anticipated concurrent user count
- Single User to Multi-User: Successfully transformed a single-annotator prototype into a full multi-stage workflow system
- Data Structure Redesign: Reorganized underlying data model to support the expanded workflow requirements
- Rejection Pathways: Implemented multiple rejection options for problematic images
- Source Truth Preservation: Ensured that approval/rejection decisions were recorded in the source dataset
- Image Quality Control: Added specific workflows for identifying incorrectly categorized or unclear images
- Downstream Integration: Maintained data integrity for machine learning model training purposes
- Queue Metrics: Real-time visualization of work items in various states
- User Productivity: Tracking of annotation completion rates by user over configurable time periods
- Quality Indicators: Monitoring of review outcomes (acceptance vs. rejection rates)
- Error Tracking: Integration with third-party logging services for error aggregation
- Alert System: Developer notifications for critical system errors or anomalies
- Performance Monitoring: Continuous tracking of response times and system resource utilization
- Throughput Increase: Significant reduction in time required per annotation
- Quality Enhancement: Improved consistency and accuracy through structured input
- Resource Optimization: More effective use of specialist dermatologist time
- Learning Curve: Reduced training time for new annotators through intuitive interface
- Scalable Architecture: System designed to handle growing image repositories
- Data Quality: Robust validation and review processes ensuring high-quality annotations
- Integration Capability: APIs designed for seamless integration with ML pipelines
- Maintainability: Well-documented, version-controlled database through Flyway
- Specialist-Led Training: Senior dermatologists conduct training for new annotators
- Documentation: Comprehensive user guides and workflow documentation
- Continuous Learning: Periodic feedback sessions to refine the annotation process
- Enhanced CI/CD Pipeline: Streamline development and deployment processes
- Comprehensive Testing Strategy: Implement robust automated testing frameworks
- ML Integration: Direct pipeline from annotations to model training datasets
- Additional Annotation Types: Support for bounding box annotations and temporal data selection
- Workflow Customization: Configurable workflows for different annotation needs
- Advanced Analytics: More sophisticated reporting on annotation quality and efficiency
This annotation platform demonstrates my ability to lead the end-to-end design and implementation of complex, domain-specific applications. By taking ownership of the architecture and database design while directing the development team, I delivered a solution that significantly improved annotation throughput for dermatology images.
The project showcases my technical expertise in database design, API development, and workflow optimization, alongside leadership skills in translating domain requirements into technical specifications and guiding implementation. The successful deployment and adoption of this platform highlight my capability to deliver full-stack solutions that provide meaningful business value through technical innovation.