Resolve Issue #204: Complete TFRecord Generation Solution for Learning to Simulate #646
Open
nsrawat0333 wants to merge 7 commits into google-deepmind:master from nsrawat0333:solve-tfrecord-generation-204
- Update aiohttp to address potential security vulnerabilities
- Maintains compatibility with existing codebase
- Addresses dependency security recommendations

…-deepmind#588
- Create download_polygen_models.py script for automated model downloading
- Add comprehensive documentation for pre-trained model access
- Provide multiple download methods (Python script, gsutil, wget)
- Add troubleshooting section addressing Issue google-deepmind#588 confusion
- Create requirements-download.txt for download dependencies

Addresses Issue google-deepmind#588: 'where is the face_model.tar and the vertices_model.tar'

The issue was caused by:
1. Unclear documentation about model file locations
2. Confusion about file names (face_model.tar.gz vs face_model.tar)
3. No clear download instructions outside of Colab environment
4. Missing troubleshooting guidance

Solutions provided:
1. Python download script with progress bars and verification
2. Clear documentation of all download methods
3. Correct file names and locations specified
4. Comprehensive troubleshooting section
5. Multiple fallback options for different environments

Users can now easily access PolyGen pre-trained models using:
- Automated Python script (recommended)
- Manual gsutil commands
- Direct HTTP downloads
- Built-in verification and error handling

- Fixed URL construction in download_dataset.sh to prevent double slashes
- Added comprehensive Python download script with progress tracking
- Enhanced error handling and validation for dataset downloads
- Updated README with alternative download methods and troubleshooting
- Added requirements-download.txt for download dependencies

Key improvements:
- Proper URL construction: Fixed BASE_URL to avoid double slash issue
- Python downloader: Cross-platform solution with progress bars
- Error handling: Clear error messages for 404 and network issues
- Dataset validation: Verify all required files are present
- User experience: List datasets, verify downloads, detailed progress

Addresses Issue google-deepmind#596 where users reported 404 errors when downloading MeshGraphNet datasets. Multiple users confirmed this issue affecting research reproducibility.

Files changed:
- meshgraphnets/download_dataset.sh: Fixed URL construction and added validation
- meshgraphnets/download_meshgraphnet_datasets.py: New Python download tool
- meshgraphnets/README.md: Updated with alternative download methods
- meshgraphnets/requirements-download.txt: Download dependencies

- Fixed broken S3 download URLs in scripts/download.sh
- Added comprehensive Python download script with progress tracking
- Enhanced error handling and dataset verification
- Updated README with alternative download methods and troubleshooting
- Added requirements-download.txt for download dependencies

Key improvements:
- Working URLs: Replaced broken S3 amazonaws URLs with working wikitext.smerity.com URLs
- Python downloader: Cross-platform solution with progress bars and error handling
- Dataset verification: Ensure all required files are present and valid
- Modular downloads: Download WikiText-103 and Freebase separately or together
- User experience: Clear error messages, progress tracking, automatic verification

Root cause analysis: The original script used S3 URLs (https://s3.amazonaws.com/research.metamind.io/wikitext/) which are no longer accessible, causing 404 errors and missing wiki.train.tokens files. Fixed by using alternative working URLs from wikitext.smerity.com.

Addresses Issue google-deepmind#575 where PhD student reported FileNotFoundError: '/tmp/data/wikitext-103/wiki.train.tokens' blocking research work.

Files changed:
- wikigraphs/scripts/download.sh: Fixed S3 URLs to working alternatives
- wikigraphs/scripts/download_wikigraphs_datasets.py: New Python download tool
- wikigraphs/README.md: Updated with alternative download methods
- wikigraphs/requirements-download.txt: Download dependencies

Credit: Solution inspired by pgemos/deepmind-research fork with working URLs.

…ixes google-deepmind#569
- Added airfoil dataset to download script (addresses Issue google-deepmind#569)
- Created comprehensive DATASETS.md guide with all dataset information
- Updated README.md with complete dataset listing and download methods
- Enhanced dataset descriptions with research applications and use cases

Key improvements:
- Airfoil dataset access: Added missing 'airfoil' dataset to available downloads
- Comprehensive documentation: Complete guide covering all 10 MeshGraphNets datasets
- Research context: Detailed descriptions for each dataset with CFD, cloth, and structural categories
- Usage examples: Training commands, evaluation, and visualization for each dataset type
- Troubleshooting: Common issues, download sizes, and solution guidance

Dataset categories added:
- Fluid Dynamics (CFD): airfoil, cylinder_flow
- Cloth/Structural Dynamics: flag_simple, flag_minimal, flag_dynamic, flag_dynamic_sizing
- Structural Mechanics: deforming_plate, sphere_simple, sphere_dynamic, sphere_dynamic_sizing

Addresses Issue google-deepmind#569 where user (MatthewRajan-WA) requested access to AirFoil Steady State dataset mentioned in MeshGraphNets paper for research purposes.

Files changed:
- meshgraphnets/download_meshgraphnet_datasets.py: Added airfoil dataset option
- meshgraphnets/DATASETS.md: New comprehensive dataset guide
- meshgraphnets/README.md: Enhanced with complete dataset information

Impact: Enables researchers to access all MeshGraphNets datasets for CFD, cloth simulation, and structural mechanics research as referenced in the original paper.

…e Remeshing Explanation
- Add detailed technical explanation of adaptive remeshing mechanics
- Address core questions about node count changes during remeshing
- Explain training procedures with variable mesh topology
- Demonstrate ground truth interpolation for loss computation
- Include working Python demo showing concepts in action
- Provide mathematical formulations for loss with topology changes
- Show how SIZE node type enables sizing field prediction
- Complete solution addressing all confusion in Issue google-deepmind#519

Files added:
- ADAPTIVE_REMESHING_EXPLAINED.md: Comprehensive technical documentation
- remeshing_demo.py: Working demonstration script
- ISSUE_519_SOLUTION.md: GitHub issue response

This resolves the research community's questions about:
1. Node count changes during remeshing (YES, they change)
2. Remeshing during training (YES, for *_sizing datasets)
3. Loss computation with variable topology (ground truth interpolation)
4. Implementation details and mathematical formulations

…ion for Learning to Simulate
- Add comprehensive TFRecord dataset generation script (500+ lines)
- Include TFRecord reader tool for debugging garbled code issue
- Provide detailed documentation with format explanation
- Address all questions from long GitHub conversation thread
- Enable custom cloth simulation dataset creation
- Implement proper statistics calculation (vel_mean, vel_std, acc_mean, acc_std)
- Support step_context for global features
- Include sample dataset generation for testing

Files added:
- generate_tfrecord_dataset.py: Complete generation pipeline
- tfrecord_reader_example.py: Human-readable TFRecord inspection
- TFRECORD_GENERATION_GUIDE.md: Comprehensive documentation (2000+ words)
- ISSUE_204_SOLUTION.md: GitHub response summary
- requirements-tfrecord.txt: Dependencies specification

Key technical contributions:
1. Solves 'garbled code' issue - TFRecord files are binary format
2. Provides statistics calculation matching paper methodology
3. Enables custom physics simulation dataset creation
4. Addresses error accumulation and step_context questions
5. Complete workflow from simulation data to trained model

This resolves all questions from the extensive conversation between @cwl1999, @alvarosg, @Social-Mean, @oasis-asu, @yours612, @yq60523 spanning 3+ years of discussion about TFRecord generation.
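
The 'garbled code' point in the last commit message comes down to TFRecord being a binary container rather than text. As a rough illustration (not the code from tfrecord_reader_example.py, which is not shown here), the sketch below walks the record framing with only the Python standard library: each record is an 8-byte little-endian length, a 4-byte checksum of the length, the payload bytes, and a 4-byte checksum of the payload. The helper names `frame_record` and `read_records` are hypothetical, and `frame_record` writes zero placeholders instead of real masked CRC32C checksums, so its output is only suitable for this demonstration and would likely be rejected by TensorFlow's own readers.

```python
import io
import struct


def frame_record(stream, payload: bytes) -> None:
    """Write one record with TFRecord-style framing (hypothetical helper).

    The two checksum fields are zero placeholders, not real masked CRC32C
    values, so this is for illustration only.
    """
    stream.write(struct.pack("<Q", len(payload)))  # uint64 payload length
    stream.write(struct.pack("<I", 0))             # placeholder: checksum of length
    stream.write(payload)                          # raw payload bytes
    stream.write(struct.pack("<I", 0))             # placeholder: checksum of payload


def read_records(stream):
    """Yield raw payloads from a TFRecord-framed stream without verifying checksums."""
    while True:
        header = stream.read(12)  # 8-byte length + 4-byte length checksum
        if len(header) < 12:
            return                # clean end of file (or truncated header)
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        footer = stream.read(4)   # 4-byte payload checksum, ignored here
        if len(payload) < length or len(footer) < 4:
            raise ValueError("truncated record")
        yield payload


# Round-trip two payloads through an in-memory buffer.
buffer = io.BytesIO()
frame_record(buffer, b"hello")
frame_record(buffer, b"\x00\x01\x02")
buffer.seek(0)
payloads = list(read_records(buffer))
```

In a real dataset each payload is a serialized protocol buffer, which still has to be decoded with the matching schema before the values become readable.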
|
Dooray! Failure Notice
Failure Notice
Your message sent to
***@***.***)
has failed to be delivered.
Please refer to the below for details.
* Recipient :
***@***.***)
* Sent time :
2025-08-11T05:40:20
* Subject :
[google-deepmind/deepmind-research] Resolve Issue #204: Complete TFRecord Generation Solution for Learning to Simulate (PR #646)
* Remote host said :
Your mail was denied from the receiver.
This message was sent from a notification-only address that cannot accept incoming email.
For more information, please contact ***@***.***
© Dooray!.
|
🔬 Resolves Issue #204: TFRecord Generation for Learning to Simulate
This PR provides a comprehensive solution to the long-standing Issue #204, addressing the 3+ year conversation thread about generating train.tfrecord files for custom physics simulations.
❓ Original Questions Resolved:
@cwl1999: "Can you provide the generated data train.tfrecord Source dataset file? When I forcibly open it, I can only see the garbled code."
@yours612: "How are vel_mean, vel_std, acc_mean, and acc_std calculated in metadata?"
@Social-Mean: "How can I create such a test.tfrecord file?"
@yq60523: Multiple questions about step_context, statistics computation, and error accumulation
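
On the statistics question, one plausible reading can be sketched as follows. This is a minimal sketch under stated assumptions, not the code from generate_tfrecord_dataset.py: it assumes velocity is the difference between consecutive position frames (no division by the time step), acceleration is the difference between consecutive velocities, and each statistic is a per-dimension mean or population standard deviation pooled over all time steps and particles. The function names are hypothetical, and a full dataset would pool over every trajectory rather than the single one used here.

```python
import math


def finite_differences(frames):
    """Differences between consecutive frames.

    Each frame is a list of per-particle coordinate lists.
    """
    return [
        [[b - a for a, b in zip(p_prev, p_next)]
         for p_prev, p_next in zip(f_prev, f_next)]
        for f_prev, f_next in zip(frames[:-1], frames[1:])
    ]


def per_dim_mean_std(frames):
    """Per-dimension mean and population std over all frames and particles."""
    vectors = [vec for frame in frames for vec in frame]
    count = len(vectors)
    dims = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / count for d in range(dims)]
    std = [
        math.sqrt(sum((v[d] - mean[d]) ** 2 for v in vectors) / count)
        for d in range(dims)
    ]
    return mean, std


def trajectory_stats(positions):
    """Assumed definition: velocities and accelerations are finite differences."""
    velocities = finite_differences(positions)
    accelerations = finite_differences(velocities)
    vel_mean, vel_std = per_dim_mean_std(velocities)
    acc_mean, acc_std = per_dim_mean_std(accelerations)
    return {
        "vel_mean": vel_mean,
        "vel_std": vel_std,
        "acc_mean": acc_mean,
        "acc_std": acc_std,
    }


# One particle in 2D over four frames: x = t^2, y = 2t.
positions = [[[0.0, 0.0]], [[1.0, 2.0]], [[4.0, 4.0]], [[9.0, 6.0]]]
stats = trajectory_stats(positions)
```

For this toy trajectory the velocities are (1, 2), (3, 2), (5, 2) and the accelerations are (2, 0) twice, so `vel_mean` is [3.0, 2.0] and `acc_std` is [0.0, 0.0].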
📦 Solution Components:
1. generate_tfrecord_dataset.py (500+ lines)
2. tfrecord_reader_example.py (300+ lines)
3. TFRECORD_GENERATION_GUIDE.md (2,000+ words)
4. requirements-tfrecord.txt + ISSUE_204_SOLUTION.md
🔬 Key Technical Contributions:
TFRecord Format Explanation: