This project is an implementation of the protovoice model as described in:
Finkensiep, Christoph, and Martin Rohrmeier. 2021. “Modeling and Inferring Proto-Voice Structure in Free Polyphony.”
In Proceedings of the 22nd International Society for Music Information Retrieval Conference, 189–96. Online.
It provides types and functions for representing and working with protovoice derivations, various parser implementations, plotting functionality, and a probabilistic model.
Have a look at the documentation.
This project is split into two packages:
- protovoices for the main model code
- protovoices-rl for a parser that is trained through reinforcement learning
The library part of the project (which implements the model) contains two types of modules:
generic modules, which work on generic "path grammars"
(with the common "outer operations" split, spread, and freeze),
and modules that are specific to the protovoice grammar.
- Common (docs) Common types and functions for representing and working with generic "path grammars" and their derivations.
- Display (docs) Generic code for plotting path grammar derivations.
- ChartParser (docs) A semiring chart parser that exhaustively parses a path grammar.
- Scoring.FunTyped (docs) A representation of semiring scores with "holes" based on closures (used by the chart parser).
- Scoring.Deprecated.Flat (docs) Partial semiring scores based on lists (previously used by the chart parser).
- GreedyParser (docs) A greedy parser that tries to find a single parse for a path grammar by choosing the next reduction step according to a policy (e.g., randomly).
- PVGrammar (docs) Representing protovoice derivations and their operations.
- PVGrammar.Parse (docs) The parsing direction of the PV grammar.
- PVGrammar.Generate (docs) The generative direction of the PV grammar.
- PVGrammar.Prob.Simple (docs) A probabilistic model of the PV grammar.
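ChartParser is described above as a semiring chart parser. As a rough illustration of what that means, here is a minimal sketch of a semiring interface with two instances. All names in it (`Semiring`, `Count`, `scoreItem`) are made up for this example; they are not the API of ChartParser or the Scoring modules, which may be organised quite differently.

```haskell
-- Illustrative sketch only: hypothetical names, not the library's API.

-- | A semiring: 'plus' combines alternative derivations of the same item,
-- 'times' combines the parts that make up a single derivation.
class Semiring a where
  zero :: a
  one :: a
  plus :: a -> a -> a
  times :: a -> a -> a

-- | Counting semiring: the score is the number of derivations.
newtype Count = Count Integer
  deriving (Eq, Show)

instance Semiring Count where
  zero = Count 0
  one = Count 1
  plus (Count a) (Count b) = Count (a + b)
  times (Count a) (Count b) = Count (a * b)

-- | Boolean semiring: the score only records whether a derivation exists.
instance Semiring Bool where
  zero = False
  one = True
  plus = (||)
  times = (&&)

-- | Score of one chart item, given its alternative analyses.
-- Each alternative is the list of scores of its sub-items.
scoreItem :: Semiring a => [[a]] -> a
scoreItem alternatives =
  foldr plus zero (map (foldr times one) alternatives)
```

For example, `scoreItem [[Count 2, Count 3], [Count 1]]` evaluates to `Count 7`: the first alternative contributes 2 * 3 derivations and the second contributes one more. Swapping the instance changes what the same parse computes, which is the main appeal of the semiring formulation.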
- MainISMIR Code for the paper linked above.
- MainLearning Code for a Bayesian inference experiment (part of my dissertation).
- MainExamples Generates or demonstrates various examples.
- MainParsing Testbed for parsing.
- RL: A reinforcement-learning agent for parsing pieces (uses GreedyParser). Examples of usage can be found in MainRLChords (see below).
- RL.ModelTypes: type-level model parameters (e.g. tensor sizes and device).
- RL.Encoding: translating protovoice datastructures into HaskTorch tensors.
- RL.Model: the NN model used by the RL agents.
- RL.Callbacks: reward and scheduling functions.
- RL.DQN: implementation of a Deep Q-Learning agent.
- RL.ReplayBuffer: used by RL.DQN.
- RL.A2C: implementation of an Advantage Actor Critic agent.
- RL.A2CHelpers: HaskTorch helper functions used by RL.A2C.
- RL.Plotting: plotting histories.
- MainRLChords.hs: train the model on chords.
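GreedyParser is described above as choosing each reduction step according to a policy, and the RL agents build on it. The loop below is a minimal sketch of that idea on a toy problem (merging adjacent numbers). Every name in it (`greedyParse`, `mergeOptions`, `applyMerge`, `firstPolicy`) is hypothetical and chosen for illustration; the actual GreedyParser and RL modules have their own types and interfaces.

```haskell
-- Illustrative sketch only: hypothetical names, not the GreedyParser API.
import Data.Functor.Identity (Identity (..))
import Data.List.NonEmpty (NonEmpty (..))

-- | Repeatedly ask a policy to pick one of the available reduction steps
-- until no step is left. Returns the final state and the chosen steps
-- in the order in which they were applied.
greedyParse
  :: Monad m
  => (state -> [step])                  -- enumerate possible reduction steps
  -> (state -> step -> state)           -- apply a step to a state
  -> (state -> NonEmpty step -> m step) -- policy: choose one of the steps
  -> state
  -> m (state, [step])
greedyParse options apply policy = go []
 where
  go acc st = case options st of
    [] -> pure (st, reverse acc)
    (s : ss) -> do
      step <- policy st (s :| ss)
      go (step : acc) (apply st step)

-- Toy problem: a step merges the two adjacent numbers at the given index.
mergeOptions :: [Int] -> [Int]
mergeOptions xs = [0 .. length xs - 2]

applyMerge :: [Int] -> Int -> [Int]
applyMerge xs i = case splitAt i xs of
  (before, a : b : after) -> before ++ (a + b) : after
  _ -> xs -- index out of range: leave the state unchanged

-- | A trivial deterministic policy: always take the first option.
firstPolicy :: state -> NonEmpty step -> Identity step
firstPolicy _ (s :| _) = Identity s
```

With these definitions, `runIdentity (greedyParse mergeOptions applyMerge firstPolicy [1, 2, 3, 4])` evaluates to `([10], [0, 0, 0])`. Because the policy runs in an arbitrary monad, the same loop also accepts a random policy in `IO` or a learned one, which is roughly the role an RL agent would play.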
You can build the main project with stack using:
$ stack build protovoices
Run any of the executables using:
$ stack exec {ismir2021,learn,examples,parse}
To build the documentation, run:
$ stack haddock
The RL parser uses hasktorch, which requires libtorch as a dependency. You can get it like this:
$ cd deps
$ ./get-deps.sh {cpu,cuda,rocm}
To build the project, you'll need to add the path to libtorch (in deps/libtorch/) to the linker path,
which you can do by sourcing deps/setenv:
$ . deps/setenv
$ stack build
If you prefer using Nix, you might find a solution in the hasktorch repo.