Skip to content

Update new language doc #126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

deonajulary06
Copy link
Collaborator

Updated document with additional instructions and fixed errors

Comment on lines +13 to +17
നിനക്ക്,second,singular,dative,dependency=nonhonorific
നീ,second,singular,nominative,dependency=nonhonorific
നിനെ,second,singular,accusative,dependency=nonhonorific
നിന്റെ,second,singular,genitive,dependency=dependent,dependency=nonhonorific
നിന്റേതു്,second,singular,genitive,dependency=independent,dependency=nonhonorific
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. Typically gender, number, or animacy would be used for dependency. The dependency is typically being used for the word being used in combination to the pronoun. For example, if I were to say "mio" or "mia" in a language, and the gender depended on the gender of the object being possessed instead of the gender of the person being referenced, then I'd use dependency.

Comment on lines +2 to +15
* Copyright 2025 Apple Inc. All rights reserved.
*/
#include <inflection/dialog/language/MlCommonConceptFactory.hpp>

namespace inflection::dialog::language {

MlCommonConceptFactory::MlCommonConceptFactory(const ::inflection::util::ULocale& language)
: super(language)
{
}

MlCommonConceptFactory::~MlCommonConceptFactory()
{
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all of the new files, please make sure that the copyright is not Apple, and it's not Unicode.

I'm assuming that this class is trying to emulate the English behavior, where a quantity changes by only it being singular or plural for the noun in a quantity. For example, 1 man, 2 men, 1 woman, 2 women, and so forth.

Are such quantities affected by grammatical case? If so, then this class likely needs a little more customization for the quantify method. For Slavic languages, the rules can get fairly complicated, and Malayalam seems to have more grammatical cases than Russian. So I'm wondering how this would work. I also see that the number pronunciation doesn't vary like many other European languages. So Malayalam language may be simpler to support.

if (displayString.ends_with(u"ഉടെ") || // uṭe
displayString.ends_with(u"യുടെ") || // yude (my, your, his, her...)
displayString.ends_with(u"ന്റെ") || // ente (mine), avante, etc.
displayString.ends_with(u"ആയുടെ")) // āyuṭe (fem. 3rd person possessive)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this logic correct? Are these all suffixes? Should the displayString be longer than the string being matched? If so, perhaps enumerating these strings in a loop and doing a length comparison would be more accurate.

MlGrammarSynthesizer_CountLookupFunction::MlGrammarSynthesizer_CountLookupFunction()
: super(::inflection::util::LocaleUtils::MALAYALAM(),
{GrammemeConstants::NUMBER_SINGULAR(), GrammemeConstants::NUMBER_PLURAL()},
{GrammemeConstants::POS_NOUN(), GrammemeConstants::POS_VERB()})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do adjectives inflect in Malayalam?

Also it's probably better to rename this to MlGrammarSynthesizer_NumberLookupFunction. That would be more accurate. Unfortunately some of the classes use the term Count, but it's more accurate to say Number. There is a reason for this historical inaccuracy in this code that I don't need to elaborate here.

Comment on lines +22 to +34
, caseFeature(*npc(model.getFeature(GrammemeConstants::CASE)))
, definitenessFeature(*npc(model.getFeature(GrammemeConstants::DEFINITENESS)))
, adjectivalFeature(*npc(model.getFeature(ADJECTIVAL)))
, particleMap({
{GrammemeConstants::CASE_ACCUSATIVE(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"യെ", u"നെ"}},
{GrammemeConstants::CASE_DATIVE(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"ക്ക്", u"നു"}},
{GrammemeConstants::CASE_GENITIVE(), {MlGrammarSynthesizer::EMPTY_SET(), u"", u"ന്റെ"}},
{GrammemeConstants::CASE_LOCATIVE(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"യിൽ", u"ൽ"}},
{GrammemeConstants::CASE_ABLATIVE(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"ഇൽനിന്ന്", u"ൽനിന്ന്"}},
{GrammemeConstants::CASE_INSTRUMENTAL(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"ഉപയോഗിച്ച്", u"കൊണ്ട്"}},
{GrammemeConstants::CASE_VOCATIVE(), {MlGrammarSynthesizer::FINAL_VOWELS(), u"ആ", u"േ"}},
{PREDICATIVE, {MlGrammarSynthesizer::FINAL_VOWELS(), u"ആണ്", u"ഇാണ്"}}
}) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is surprising. It looks like you took some inspiration from Korean for the particle map. Is Malayalam that simple and reproducible? I'd figure that it would have some way that mixes grammatical gender and grammatical number in with the case.

I'm a little surprised by the usage of DEFINITENESS, but grammar.xml doesn't define that. I'd expect that the test code would crash here during construction. I recommend reviewing this DEFINITENESS usage.

Comment on lines +15 to +23
<!-- Malayalam Singular Plural Inflection Tests -->
<test><source number="plural">കുട്ടി</source><result>കുട്ടികൾ</result></test>
<test><source number="singular">പുസ്തകങ്ങൾ</source><result>പുസ്തകം</result></test>
<test><source number="plural">മരം</source><result>മരങ്ങൾ</result></test>
<test><source number="singular">കഥകൾ</source><result>കഥ</result></test>

<!-- Malayalam Gender Inflection Tests (for adjectives) -->
<test><source gender="feminine">നല്ല</source><result exists="true">നല്ല</result></test>
<test><source gender="masculine">നല്ല</source><result exists="true">നല്ല</result></test>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like MlGrammarSynthesizer_MlDisplayFunction doesn't support these operations yet.


#include <inflection/dialog/DefaultDisplayFunction.hpp>
#include <inflection/grammar/synthesis/fwd.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer_ParticleResolver.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compilation fails in this pull request due to this missing header. Since it's being referenced in this class, perhaps you forgot to include it in your commits.

@@ -29,13 +31,13 @@ TODO: We need to expand what each of these do.
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer.cpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this documentation. This is a good improvement.

@@ -11,6 +11,8 @@ NOTE: Take a look at [PR #40](https://github.com/unicode-org/inflection/pull/40)
In general, to bootstrap your progress look for grammatically similar language that's already supported, e.g. if you are adding Serbian look for existing Russian implementation.
This will help you find most of the files you need to add/change and will speed up implementation of the rules and lexicons.

Before you add new language support, go to the README.md in the inflection subfolder (inflection/inflection/README.md), build the project, and make sure all the tests run on your computer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion!

#include <vector>
#include <map>

class inflection::grammar::synthesis::MlGrammarSynthesizer_MlDisplayFunction : public virtual ::inflection::dialog::DefaultDisplayFunction {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend looking at EnGrammarSynthesizer_EnDisplayFunction, EsGrammarSynthesizer_EsDisplayFunction or DeGrammarSynthesizer_DeDisplayFunction for how to use DictionaryLookupInflector/dictionaryInflector. The German one uses grammatical cases, which will need to be extended to what Malayalam uses. The English and Spanish versions also include some examples of how to guess the singular and plural forms of nouns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants