-
Notifications
You must be signed in to change notification settings - Fork 12
Making the dockerfile and code ready for kagent. #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
simonprovost
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much @papagala ! Here are some preliminary comments prior @rafiattrach more important review 🫡
Cheers!
| """Main entry point for MCP server.""" | ||
| # Run the FastMCP server | ||
| mcp.run() | ||
| # Check if we should run in HTTP mode (for Kubernetes/Docker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline comments are difficult to maintain in OSS, much better to add a Notes section via docstrings of any functions/classes/ and so on I'd say! Let's see if it's fine with @rafiattrach though!
~> Note that this point is only for py-based-files.
PS: I am aware that the codebase has seen such inline comments; I believe we already discussed some tech debts with Rafi and the team, let's try to avoid including more, if I may say?
Cheers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it repeats many times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point about consistency, some functions might be over-commented while others could use more context. I think both docstrings and inline comments serve their purpose though - wouldn't really call it tech debt, more like style standardization. Happy to discuss commenting standards as a team though
| ## 🤖 AI Agent Integration | ||
|
|
||
| ### Agent Instructions | ||
|
|
||
| Copy these instructions into your AI agent configuration for optimal MIMIC-IV querying: | ||
|
|
||
| **Core Workflow:** | ||
| 1. Always start with schema discovery using get_database_schema() | ||
| 2. Use get_table_info(table_name) to see columns and sample data | ||
| 3. Check sample data for actual formats (dates, column names, values) | ||
| 4. Write queries with proper JOINs and LIMIT clauses | ||
| 5. Provide context and interpretation with results | ||
|
|
||
| **Key Tables:** | ||
| - patients: Demographics (subject_id, gender, anchor_age, anchor_year) | ||
| - admissions: Hospital stays (hadm_id, admittime, dischtime, admission_type) | ||
| - icustays: ICU episodes (stay_id, intime, outtime, los) | ||
| - labevents: Lab results (itemid, value, valuenum, valueuom) | ||
| - prescriptions: Medications (drug, dose_val_rx, route) | ||
|
|
||
| **Best Practices:** | ||
| - Always use LIMIT to prevent returning too many rows | ||
| - Verify column names from sample data (e.g., 'anchor_age' not 'age') | ||
| - Handle NULLs explicitly in clinical data | ||
| - Use convenience functions for common patterns | ||
| - Explain results, don't just dump data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that not redundant with previous sections? I may be mistaken!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if possible @papagala we can remove redundant things especially in README since it quickly gets bloated with repetitive things especially regarding prompting or examples
| - 🌐 **Multi-tenant Support**: Organization-level data isolation | ||
|
|
||
| ## Contributing | ||
| ## 🐳 Kubernetes Deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn’t we start using collapsible, details-based sections in the README—keeping only the ultimate essentials for end users (such as clinicians), and placing variations like Kubernetes deployment inside collapsible sections?
Would you mind trying @papagala ?
It's becoming pretty long!
@rafiattrach Would you agree with this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea good idea but perhaps we can keep that for a followup separate PR since this isn't the only section that needs such a change
| ### Sample Questions | ||
|
|
||
| **Basic Exploration:** | ||
| - What tables are available in the database? | ||
| - Show me the structure of the patients table | ||
| - Give me a sample of 5 rows from the icustays table | ||
|
|
||
| **Patient Analysis:** | ||
| - How many patients are in the database? | ||
| - Show me the age and gender distribution | ||
| - What's the average ICU length of stay? | ||
|
|
||
| **Clinical Queries:** | ||
| - Show me patients with diabetes diagnoses | ||
| - What are the most common admission types? | ||
| - Find patients with both high glucose and kidney problems | ||
| - Compare ICU length of stay between emergency and elective admissions | ||
|
|
||
| **Deep Dives:** | ||
| - Give me the complete medical history for patient 10001 | ||
| - Show all ICU stays, diagnoses, and medications for patient 10006 | ||
| - What were the lab trends during a patient's last admission? | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup I think this is redundant
| # Makefile for M3 Docker Image Build and Push | ||
| DOCKER ?= docker | ||
| IMAGE_NAME := m3-mimic-demo | ||
| IMAGE_TAG ?= 0.0.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I wrong in asking why 0.0.3 please @papagala ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps 0.3.0 was meant since that's latest version on github/pypi? in any case perhaps we can already set 0.4.0 since we can bump it up after it gets merged
rafiattrach
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a rebase first, and the other big PR should get merged faster since it's nearly done. I can test and review this after that merges and after the rebase. Thanks for the addition and contribution!
| # Makefile for M3 Docker Image Build and Push | ||
| DOCKER ?= docker | ||
| IMAGE_NAME := m3-mimic-demo | ||
| IMAGE_TAG ?= 0.0.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps 0.3.0 was meant since that's latest version on github/pypi? in any case perhaps we can already set 0.4.0 since we can bump it up after it gets merged
| - 🌐 **Multi-tenant Support**: Organization-level data isolation | ||
|
|
||
| ## Contributing | ||
| ## 🐳 Kubernetes Deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea good idea but perhaps we can keep that for a followup separate PR since this isn't the only section that needs such a change
| """Main entry point for MCP server.""" | ||
| # Run the FastMCP server | ||
| mcp.run() | ||
| # Check if we should run in HTTP mode (for Kubernetes/Docker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point about consistency, some functions might be over-commented while others could use more context. I think both docstrings and inline comments serve their purpose though - wouldn't really call it tech debt, more like style standardization. Happy to discuss commenting standards as a team though
| ## 🤖 AI Agent Integration | ||
|
|
||
| ### Agent Instructions | ||
|
|
||
| Copy these instructions into your AI agent configuration for optimal MIMIC-IV querying: | ||
|
|
||
| **Core Workflow:** | ||
| 1. Always start with schema discovery using get_database_schema() | ||
| 2. Use get_table_info(table_name) to see columns and sample data | ||
| 3. Check sample data for actual formats (dates, column names, values) | ||
| 4. Write queries with proper JOINs and LIMIT clauses | ||
| 5. Provide context and interpretation with results | ||
|
|
||
| **Key Tables:** | ||
| - patients: Demographics (subject_id, gender, anchor_age, anchor_year) | ||
| - admissions: Hospital stays (hadm_id, admittime, dischtime, admission_type) | ||
| - icustays: ICU episodes (stay_id, intime, outtime, los) | ||
| - labevents: Lab results (itemid, value, valuenum, valueuom) | ||
| - prescriptions: Medications (drug, dose_val_rx, route) | ||
|
|
||
| **Best Practices:** | ||
| - Always use LIMIT to prevent returning too many rows | ||
| - Verify column names from sample data (e.g., 'anchor_age' not 'age') | ||
| - Handle NULLs explicitly in clinical data | ||
| - Use convenience functions for common patterns | ||
| - Explain results, don't just dump data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if possible @papagala we can remove redundant things especially in README since it quickly gets bloated with repetitive things especially regarding prompting or examples
Add Kubernetes Deployment Support and Build Automation
Summary
This PR adds Kubernetes/Docker deployment capabilities to M3, enabling the MCP server to run in containerized environments with HTTP transport. It also includes build automation via Makefile and comprehensive documentation for AI agent integration.
Changes
🐳 Dockerfile Enhancements
MCP_TRANSPORT=http- enables HTTP mode instead of STDIOMCP_HOST=0.0.0.0- binds to all interfaces for container networkingMCP_PORT=3000- exposes MCP server on port 3000MCP_PATH=/sse- configures server-sent events endpoint🔧 MCP Server Transport Flexibility (src/m3/mcp_server.py)
MCP_TRANSPORTenvironment variablestreamable-httptransport for Kubernetes📦 Build Automation (New Makefile)
DOCKERvariabledownload-db- downloads MIMIC-IV demo database usinguvbuild/build-bigquery- builds lite and BigQuery Docker imagespush/push-bigquery- pushes images to registrytest-image- validates built imageclean- removes downloaded database filesIMAGE_TAG(default: 0.0.3)📚 Documentation (README.md)
Usage Examples
Build and Deploy