Synthetic Data in the Era of Large Language Models
Tutorial at ACL 2025
- + content="Trustworthy Machine Reasoning with Foundation Models (tutorial at ACL 2025)."> + -- We present the first method capable of photorealistically reconstructing a non-rigidly - deforming scene using photos/videos captured casually from mobile phones. + Progress in natural language processing has historically been driven by better data, and researchers today are increasingly using ‘synthetic data’ - data generated with the assistance of large language models - to make dataset construction faster and cheaper.
However, most synthetic data generation approaches are executed in an ad hoc manner and ‘reinvent the wheel’ rather than build on prior foundations. This tutorial seeks to build a shared understanding of recent progress in synthetic data generation from NLP and related fields by grouping and describing major methods, applications, and open problems.
Our tutorial will be divided into four main sections. First, we will describe algorithms for producing high-quality synthetic data. Second, we will describe how synthetic data can be used to advance the general-purpose development and study of language models. Third, we will demonstrate how to customize synthetic data generation to support scenario-specific applications. Finally, we will discuss open questions about the production and use of synthetic data that must be answered to overcome some of their current limitations. Our goal is that by unifying recent advances in this emerging research direction, we can build foundations upon which the community can improve the rigor, understanding, and effectiveness of synthetic data moving forward.
July 27, 2025 - Hall B
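Many of the generation algorithms covered in the first section (e.g., Self-Instruct and Unnatural Instructions, both listed below) share a simple generate-then-filter loop: an LLM proposes new examples from a small seed pool, weak or redundant proposals are discarded, and accepted proposals re-seed later prompts. The sketch below is a minimal illustration of that general pattern only, not the method of any single paper; the complete stub stands in for whatever LLM completion API you use, and the prompt format and word-overlap filter are placeholder assumptions.

import random

SEED_INSTRUCTIONS = [
    "Summarize the following paragraph in one sentence.",
    "Translate the following sentence into French.",
    "Classify the sentiment of this review as positive or negative.",
]

def complete(prompt: str) -> str:
    # Placeholder for an LLM completion call; wire up a provider client here.
    raise NotImplementedError

def propose_instruction(pool: list[str]) -> str:
    # Show a few sampled demonstrations and ask the model for one new task.
    demos = "\n".join(f"- {t}" for t in random.sample(pool, min(3, len(pool))))
    prompt = f"Here are some task instructions:\n{demos}\n- "
    text = complete(prompt).strip()
    return text.splitlines()[0] if text else ""

def is_novel(candidate: str, pool: list[str]) -> bool:
    # Crude word-overlap filter; real pipelines use ROUGE or embeddings.
    words = set(candidate.lower().split())
    return all(
        len(words & set(t.lower().split())) / max(len(words), 1) < 0.7
        for t in pool
    )

def grow_pool(pool: list[str], target_size: int) -> list[str]:
    pool = list(pool)
    while len(pool) < target_size:
        candidate = propose_instruction(pool)
        if candidate and is_novel(candidate, pool):
            pool.append(candidate)  # accepted proposals re-seed later prompts
    return pool

Real pipelines typically replace the overlap heuristic with ROUGE- or embedding-based deduplication and add quality filters, such as model-based scoring, before training on the result.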
The papers referenced in the tutorial can be found below.
Distilling the Knowledge in a Neural Network
Hinton et al., 2015

Improving Neural Machine Translation Models with Monolingual Data
Sennrich et al., 2016

Sequence-Level Knowledge Distillation
Kim & Rush, 2016

Scalable agent alignment via reward modeling: a research direction
Leike et al., 2018

Are Pretrained Language Models Symbolic Reasoners Over Knowledge?
Kassner et al., 2020

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Gururangan et al., 2020

Generating Datasets with Pretrained Language Models
Schick & Schütze, 2021

SynthBio: A Case Study in Faster Curation of Text Datasets
Yuan et al., 2021

Constitutional AI: Harmlessness from AI Feedback
Bai et al., 2022

Red Teaming Language Models with Language Models
Perez et al., 2022

Self-Instruct: Aligning Language Models with Self-Generated Instructions
Wang et al., 2022

STaR: Bootstrapping Reasoning With Reasoning
Zelikman et al., 2022

Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Honovich et al., 2022

WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Liu et al., 2022

Alpaca: A Strong, Replicable Instruction-Following Model
Taori et al., 2023

LongForm: Effective Instruction Tuning with Reverse Instructions
Köksal et al., 2023

Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Mukherjee et al., 2023

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Kim et al., 2023

Prompt2Model: Generating Deployable Models from Natural Language Instructions
Viswanathan et al., 2023

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Dong et al., 2023

Self-Alignment with Instruction Backtranslation
Li et al., 2023

Self-Refine: Iterative Refinement with Self-Feedback
Madaan et al., 2023

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Kim et al., 2023

The False Promise of Imitating Proprietary LLMs
Gudibande et al., 2023

Textbooks Are All You Need
Gunasekar et al., 2023

UltraFeedback: Boosting Language Models with Scaled AI Feedback
Cui et al., 2023

WizardLM: Empowering Large Pre-trained Language Models to Follow Complex Instructions
Xu et al., 2023

AI models collapse when trained on recursively generated data
Shumailov et al., 2024

Better Synthetic Data by Retrieving and Transforming Existing Datasets
Gandhi et al., 2024

Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Hu et al., 2024

Checklists Are Better Than Reward Models For Aligning Language Models
Viswanathan et al., 2024

Evaluating Language Models as Synthetic Data Generators
Kim et al., 2024

Evaluating Reward Models for Language Modeling
Lambert et al., 2024

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Miranda et al., 2024

MAmmoTH2: Scaling Instructions from the Web
Yue et al., 2024

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Maini et al., 2024

SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning
Zhao et al., 2024

Synthetic continued pretraining
Yang et al., 2024

Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
Ziegler et al., 2024

The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models
Chen et al., 2024

All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning
Swamy et al., 2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI, 2025

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
Jung et al., 2025

SynthTextEval: Synthetic Text Data Generation and Evaluation for High-Stakes Domains
Ramesh et al., 2025

What Makes a Reward Model a Good Teacher? An Optimization Perspective
Razin et al., 2025
@inproceedings{synth-data-tutorial,
  title = "Synthetic Data in the Era of Large Language Models",
  author = "Viswanathan, Vijay and
    Yue, Xiang and
    Liu, Alisa and
    Wang, Yizhong and
    Neubig, Graham",
  booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 5: Tutorial Abstracts)",
  publisher = "Association for Computational Linguistics",
  year = "2025",
}
This website template is borrowed from Nerfies.