This is a Python implementation of the paper "GPT-Fabric-plus-plus: Improved Bimanual Fabric Manipulation via Pre-trained foundation models". This repository contains the code for fabric folding; all experiments are simulated in SoftGym. The code for performing fabric smoothing can be found here.
Private repo for GPT-Fabric-plus-plus Folding
I strongly recommend going through this wonderful blog written by Daniel Seita on setting up SoftGym.
- Clone the repository. Run the command conda env create -f environment.yml to create the gptfab-folding environment.
- If you are using Ubuntu 22.04, run the following commands to recompile SoftGym:
docker run \
-v PATH_TO_GPT_FABRIC:/workspace/GPT-Fabric-Smoothing \
-v PATH_TO_CONDA:PATH_TO_CONDA \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-e DISPLAY=$DISPLAY \
-e QT_X11_NO_MITSHM=1 \
-it xingyu/softgym:latest bash
If you are using another version of Ubuntu, use the following commands:
nvidia-docker run \
-v PATH_TO_GPT_FABRIC:/workspace/GPT-Fabric-Smoothing \
-v PATH_TO_CONDA:PATH_TO_CONDA \
-v /tmp/.X11-unix:/tmp/.X11-unix \
--gpus all \
-e DISPLAY=$DISPLAY \
-e QT_X11_NO_MITSHM=1 \
-it xingyu/softgym:latest bash
- After this, you should be inside the Docker container. Run the following commands inside the container:
root@9ac1efa91ca9:/workspace# cd GPT-Fabric-Smoothing
root@9ac1efa91ca9:/workspace/GPT-Fabric-Smoothing# export PATH="PATH_TO_CONDA/bin:$PATH"
root@9ac1efa91ca9:/workspace/GPT-Fabric-Smoothing# . ./prepare_1.0.sh
(gptfab-smoothing) root@9ac1efa91ca9:/workspace/GPT-Fabric-Smoothing# . ./compile_1.0.sh
If the compilation is successful, you should see the following message at the end:
[100%] Linking CXX shared module pyflex.cpython-38-x86_64-linux-gnu.so
[100%] Built target pyflex
You can quit the Docker container by typing exit.
- Back in the regular command line (outside Docker), use the following commands:
conda activate gptfab-folding
export PYFLEXROOT=${PWD}/PyFlex
export PYTHONPATH=${PYFLEXROOT}/bindings/build:$PYTHONPATH
export LD_LIBRARY_PATH=${PYFLEXROOT}/external/SDL2-2.0.4/lib/x64:$LD_LIBRARY_PATH
A better way to do this is to add all these lines to a .sh file and then source that file. In this repository, this file is called prepare_gpt.sh.
(base) rajeshgayathri2003@cappuccino:~/GPT-Fabric-plus-plus$ . ./prepare_gpt.sh
(gptfab-folding) rajeshgayathri2003@cappuccino:~/GPT-Fabric-plus-plus$
Before we evaluate GPT-Fabric++ folding, we need to have the necessary sub-goal sequences and starting configurations.
- To be consistent with prior work, we use the initial evaluation configurations for square and rectangular fabric used by Foldsformer. You can find these in cached configs/.
- The initial configurations can also be generated by running the following script:
python generate_configs.py --num_cached 100 --cloth_type square
Here num_cached denotes the number of configurations and cloth_type denotes whether the given cloth is square, rectangular, or another shape.
- In addition to the sub-goal sequences used by GPT-Fabric, we also introduce bimanual manipulation. The sub-goal sequences used can be downloaded from here.
To evaluate the folds produced by GPT-Fabric++, we need a system that can produce expert folds. We compare the two results to compute the mean particle distance error of the achieved folds. This expert system can be found in the Demonstrator directory and can be run using:
python generate_demonstrations.py --gui --task DoubleTriangle --img_size 128 --cached square
python generate_demonstrations.py --gui --task DoubleStraightBimanual --img_size 128 --cached rectangle
python generate_demonstrations.py --gui --task AllCornersInward --img_size 128 --cached square
python generate_demonstrations.py --gui --task CornersEdgesInwardBimanual --img_size 128 --cached square
where --task specifies the task name, --img_size specifies the image size captured by the camera in the simulator, and --cached specifies the filename of the cached configurations. You can remove --gui to run headless. These generated demonstrations will be saved in data/demonstrations.
Note that the same folding task can be achieved in several ways for the same cloth configuration, so we consider all possible final cloth configurations that count as a successful fold according to the expert-driven heuristic (i.e., the Demonstrator).
For each cloth configuration, 0.png is the top-down image of the initial state. {step#}-{fold#}.png is the top-down image at step {step#} for the specific way of achieving the successful fold indexed by {fold#}. The final cloth configuration is saved as a pickle file named info-{fold#}.pkl. To compute the mean particle position error (in mm) for evaluation, we measure the distance between the final configuration achieved by GPT-Fabric and each possible expert final configuration, and take the minimum of those distances.
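For reference, the metric described above can be sketched as follows; the helper name, the "pos" key, and the metres-to-millimetres conversion are assumptions about the pickle layout rather than the repository's actual evaluation code.

```python
import pickle
import numpy as np

def min_mean_particle_distance(achieved_positions, demo_info_paths):
    """Mean particle position error (in mm) against the closest expert final
    configuration. Hypothetical helper for illustration, not the repo's code."""
    errors = []
    for path in demo_info_paths:
        with open(path, "rb") as f:
            info = pickle.load(f)
        # Assumption: the info-{fold#}.pkl file stores the expert's final
        # particle positions under a "pos" key; adjust to the actual layout.
        expert_positions = np.asarray(info["pos"])
        per_particle = np.linalg.norm(achieved_positions - expert_positions, axis=1)
        # Assumption: positions are in metres, so convert the mean to millimetres.
        errors.append(per_particle.mean() * 1000.0)
    # The reported error is the minimum over all valid ways of achieving the fold.
    return min(errors)
```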
We fine-tune GPT-4o to help the system generate better folding instructions. Here are some useful links to guide you through fine-tuning GPT-4o:
The .jsonl files that we used for fine-tuning in this paper are given in training_set.jsonl and validation_set.jsonl.
In case you wish to create your own jsonl files, download the dataset from here. This dataset is based on the evaluation images used by FabricFlowNet.
Convert the images into base64 encoding using the convert_single_arm.js and convert_bimanual.js files. The output will be saved as unimanual.json and bimanual.json. In this case, the language instruction for each fold needs to be edited manually. Split the dataset in an 80:20 ratio between training and validation.
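If you would rather do the conversion in Python than with the Node scripts, here is a rough sketch of encoding one image into a base64 data URL and placing it in a chat-format example. The message layout and the instruction text are assumptions for illustration, not the exact output of convert_single_arm.js or convert_bimanual.js.

```python
import base64
import json

def image_to_data_url(image_path):
    """Encode a top-down cloth image as a base64 PNG data URL."""
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/png;base64,{encoded}"

# Hypothetical training example; the exact message layout produced by
# convert_single_arm.js / convert_bimanual.js may differ.
example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "Fold the cloth along the diagonal."},
            {"type": "image_url", "image_url": {"url": image_to_data_url("0.png")}},
        ]},
        # The expected folding instruction for this image goes here.
        {"role": "assistant", "content": "..."},
    ]
}
print(json.dumps(example)[:120])
```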
You can validate these two files using the script in validate_data.py. Note that the .jsonl files must be properly formatted.
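If you want a quick sanity check of the files yourself, a minimal sketch along the following lines will catch malformed lines. It assumes the standard OpenAI chat fine-tuning format, where every line is a JSON object with a top-level "messages" list.

```python
import json

def check_jsonl(path):
    # One JSON object per line, each with a top-level "messages" list.
    with open(path, "r", encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                raise ValueError(f"{path}: line {i} is empty")
            record = json.loads(line)  # raises ValueError if not valid JSON
            if not isinstance(record.get("messages"), list):
                raise ValueError(f"{path}: line {i} has no 'messages' list")
    print(f"{path}: OK")

check_jsonl("training_set.jsonl")
check_jsonl("validation_set.jsonl")
```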
Once the dataset is ready, we can proceed to fine-tune GPT-4o (refer to fine_tuning.py).
Use the script below to upload your training and validation datasets.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Upload the training set
response_train = client.files.create(
    file=open("training_set.jsonl", "rb"),
    purpose="fine-tune"
)
print(response_train)

# Upload the validation set
response_validate = client.files.create(
    file=open("validation_set.jsonl", "rb"),
    purpose="fine-tune"
)
print(response_validate)
Once the training and validation files have been successfully uploaded, use this script to fine-tune GPT-4o.
# Create the fine-tuning job, pointing at the uploaded files
ft_job = client.fine_tuning.jobs.create(
    training_file="file-id obtained from response_train",
    model="gpt-4o-2024-08-06",
    hyperparameters={
        "n_epochs": 2
    },
    validation_file="file-id obtained from response_validate"
)
print(ft_job)
The fine-tuning process takes a while (typically under 30 minutes). OpenAI will send an email to your registered address once the process is complete and your model is ready to use.
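If you prefer not to wait for the email, you can also poll the job status from Python. This short sketch continues the snippets above and assumes the client and ft_job objects defined there.

```python
import time

# Poll the fine-tuning job created above until it reaches a terminal state.
while True:
    job = client.fine_tuning.jobs.retrieve(ft_job.id)
    print(job.status)
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(60)

# On success, this is the model name to use when querying the fine-tuned model.
print(job.fine_tuned_model)
```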
To reproduce the results of GPT-Fabric++, run the following commands.
python eval_finetuned.py --task DoubleStraightBimanual --cached rectangle --save_vid True --total_runs 5 --eval_type zero-shot
python eval_finetuned.py --task DoubleTriangle --cached square --save_vid True --total_runs 5 --eval_type zero-shot
python eval_finetuned.py --task AllCornersInward --cached square --save_vid True --total_runs 5 --eval_type zero-shot
python eval_finetuned.py --task CornersEdgesInwardBimanual --cached square --save_vid True --total_runs 5 --eval_type zero-shot
A significant part of this work is based on GPT-Fabric by Raval et al. Check out their work here!
For any further queries, feel free to write to Gayathri at [email protected]
Older commands:
python eval_vanilla.py --task DoubleTriangle --eval_type zero-shot --cached square
python eval_vanilla.py --task DoubleStraight --eval_type zero-shot --cached rectangle
python eval_vanilla.py --task DoubleStraightBimanual --eval_type zero-shot --cached rectangle
python eval_vanilla.py --task AllCornersInward --eval_type zero-shot --cached square
python code_generation_bimanualv2.py --task DoubleTriangle --eval_type zero-shot --cached square --num_exexute 5
python code_generation_bimanualv2.py --task DoubleStraight --eval_type zero-shot --cached rectangle --num_exexute 5
python code_generation_bimanualv2.py --task DoubleStraightBimanual --eval_type zero-shot --cached rectangle --num_exexute 5
python code_generation_bimanualv2.py --task AllCornersInward --eval_type zero-shot --cached square --num_exexute 5
[There are 40 configurations in total. We can run a small-scale experiment with just 5.]
List of deprecated files:
- gpt_eval.py (based on FabricFlowNet, has the segmentation fault)
- gpt_eval_2.py (based on FabricFlowNet, has the segmentation fault)
- fold_eval.py (from earlier experiments, set-of-mark prompting)
- fold_eval_2.py (using the FabricFlowNet method)
- code_generation.py, code_generation_2.py, code_generation_bimanual.py (older code versions)