Wiktionary talk:About Polish
Not addressed yet
- Semelfactive and iterative verbs, e.g. krzyknąć pf vs pokrzyczeć pf and jadać impf vs jeść impf.
- {{pl-verb}} supports values of |a= which mark this sub-aspect classification.
- Verbs taking objects in cases other than accusative, and verbs taking multiple objects; pożyczyć means either "to borrow" or "to lend" depending on the grammatical case of the object.
- Reflexive verbs: whether to put them under an entry title with się or without it.
- Probably will go for the former.
- Treatment of adjectives which function as nouns (ciężarna) and of adjectival surnames (Kowalski).
- How to present negative deverbal derivations (chodzić → chód, wodzić → -wód, rzucać → rzut).
- {{back-form}}, {{deverbal}}
- Treatment of participles — whether they are adjectives and adverbs, a separate part of speech or verb forms.
- User:Tweenk made it a separate part of speech.
- Whether to make specifying animacy mandatory for nouns, like it is for Russian. Animacy is relevant to declension, especially of adjectives.
- Obsolete singular forms, like bliźnię or powidło.
- …
Keφr 18:09, 21 December 2013 (UTC)
Derived/Related terms and also Impersonal verbs only in the 3rd
@BigDom, Shumkichi, KamiruPL, Hythonia So I was talking on the wiktionary discord. I was wrong, we shouldn't be using {l} node for derived and related terms, and also apparently we should be using {der3} or something similar. Now, I don't know about y'all but I find that table to be really, really aesthetically displeasing. We should settle on a format. We could list multiple {l} nodes like we were doing before. Sorry for the confusion. I misunderstood something told to me. A second note, we should discuss a way to save verb conjugations that exist only in the third person. Someone suggested using the verb header template, which I really like. Can that template take slots for past/future forms etc? Vininn126 (talk) 14:57, 12 November 2021 (UTC)
- Actually, wait.
- https://en.wiktionary.org/wiki/Wiktionary:Grease_pit/2021/November#On_wrapping_entire_definitions_in_%7B%7Bl%7Cen%7D%7D Vininn126 (talk) 16:19, 12 November 2021 (UTC)
- @Vininn126 I think it's just a matter of one's taste but some consistency would be nice. I'm fine with any decision you'll make; if it can help save space, then why not. Shumkichi (talk) 19:23, 12 November 2021 (UTC)
Inflected Conjunctions and the Like
@BigDom, @Hythonia, @KamiruPL @Luxtaythe2nd added bośmy today, which has brought my attention to the fact that we probably need a better way of handling these conjunctions with the particles attached. gdyby has a small table on it, but there's nothing systematic. I think these forms should not be considered lemmas, but rather inflected forms. We may also have to update the Module:category tree. Vininn126 (talk) 16:28, 6 December 2021 (UTC)
- Resolved, Thadh made {{pl-combined form}} for us. Vininn126 (talk) 13:14, 28 January 2022 (UTC)
Spelling reforms
(Notifying Hergilei, Tweenk, Shumkichi, Wrzodek, KamiruPL, BigDom, Hythonia, Tashi): I would like to create a system wherein we can denote the alternative forms that existed before certain spelling reforms. I'd like to create a standard verbiage to put into {{alt}}, something like you can see on пёс#Russian with ro-PRO, but we would need more probably. I'd also like to create categorizing templates a la {{ru-pre-reform}}. Does anyone have a comprehensive list of spelling reforms in Polish? I believe we might need at least two, one for Linde's time and his forms (e.g. abdykacya) and then one before the 1936 reform (e.g. in Słownik Wileński and Warszawski, e.g. aljans). Middle Polish forms are probably too unpredictable, but we could probably create something like {{Middle Polish form}}, as of now I'm using a label and {{obs form}}.
The only thing I would need help with is the naming - I think we can model the Russian verbiage; I just need a list of relevant reforms.
There's also the matter of how much information to include - I think we should make these as simple as possible - no etymologies for example. We should avoid pronunciation sections in Middle Polish forms, what about the others? For Middle Polish only attested declined forms should be given, but what about the others? Vininn126 (talk) 09:14, 8 December 2022 (UTC)
- Here's the link to Polański's Reformy ortografii polskiej – wczoraj, dziś, jutro which might be helpful. Tashi (talk) 09:53, 8 December 2022 (UTC)
- So it seems that the first major reform was in 1827, after Linde's dictionary, when [y] to represent /j/ was deprecated, á was removed, and a few other changes were made. I believe forms found from before this time can have a template like pl-pre-1827 for {{alt}}. Forms after this time are found in Słownik wileński - this can be pl-pre-1877, as after that reform we find the forms found in Słownik warszawski. There seems to have been another reform before modern norms were adopted, but it seems it didn't take, so we might only need these templates - Middle Polish form, pre 1827 (Linde-like), wileński-like, and warszawski-like. Vininn126 (talk) 12:10, 8 December 2022 (UTC)
- Okay, I've done more research. Based on polszczyzna and the PDF, it seems there were four major changes. I propose we make {{pl-pre-1814}}, {{pl-pre-1906}}, {{pl-pre-1918}}, {{pl-pre-1936}}. These templates will only be used for rules within those changes (i.e. using -ja instead of -ya or -ia, like in Linde or SWil), and only attested forms. This will not affect other alternative forms. Vininn126 (talk) 22:18, 13 December 2022 (UTC)
- Also I realize we should probably add a parameter to {{pl-decl-adj-auto}} that adds the -émi forms and the like, like what we do with old_dative. Vininn126 (talk) 22:32, 13 December 2022 (UTC)
- @Tashi @Hythonia @Shumkichi @BigDom I have updated History of Polish orthography, and I'm beginning to wonder if we should only have one template, perhaps {{pl-pre-1936}}, considering it was the most effective one. So the alternative spellings of abiuracja would use this template. Thoughts? Vininn126 (talk) 13:00, 27 January 2023 (UTC)
- Also @Sławobóg Vininn126 (talk) 13:01, 27 January 2023 (UTC)
- Actually... Perhaps just {{obsolete spelling of}} would be best Vininn126 (talk) 14:49, 27 January 2023 (UTC)
- These reforms are all post-Middle-Polish so obsolete is fine. Sławobóg (talk) 16:04, 27 January 2023 (UTC)
- Well I have gone ahead and made changes to abiuracja and made abjuracya and abjuracja. I will gladly listen to any input. Vininn126 (talk) 13:54, 31 January 2023 (UTC)
Silesian as well as Old/Middle/Modern Polish
@Hythonia @BigDom @KamiruPL (as you three seem to know the most about this group) I've been thinking a lot about how these lects are currently handled/grouped. I recently updated w:Middle Polish, and I've been wondering if we shouldn't split Middle Polish off as an L2 as opposed to a label. Reasons for doing this include the fact that Middle Polish did have some fairly significant sound and grammar changes, and also we currently treat Silesian as an L2, which would give Middle Polish two descendants.
There are many ways to group it, but splitting Middle Polish off is something I've been thinking about a lot. I'm not sure if I'm for or against it, and I would really like some input. Vininn126 (talk) 20:37, 13 April 2023 (UTC)
- Other thoughts:
- One reason to NOT split Middle Polish would be that it could be too semantically similar to Polish, creating a lot of unwanted duplication. I'm not sure if it's dissimilar enough, but sometimes I do get that impression.
- If we decide we shouldn't split it off as an L2, would there be merit in having it as an ety-only language, and then having Silesian be a descendant of Polish? There are strong political implications to this, and I'm not sure I'm a fan. Vininn126 (talk) 20:41, 13 April 2023 (UTC)
Ety and col-auto cleanup project
(Notifying KamiruPL, BigDom, Hythonia, Tashi): and @Benwing2 I want to do a huge ety cleanup project.
- Find any pages missing etymologies entirely and tag them with {{rfe}} for easier tracking.
- Find any pages using only {{m}} and tag them with {{etystub}} and use an appropriate template.
- Change any instances of ", a {{calque|pl|FOO|BAR|nocap=1}}" and any variants thereof to ". {{cal|pl|FOO|BAR}}".
For #1 and #2 there would probably be a ton of them, it's something we would chip away at.
Are there any other etymology changes we should make? I'd like to add Old Polish and dates to everything, but for now I think this would be a huge step forward. I considered having a bot tag any borrowings with {{etystub}} so we could get sources on them, I'm not sure if that'd be a step too far.
For Derived/related terms, I am wondering if a bot could check any pages with instances of {{col-auto}} without a title and separate the terms into different instances of {{col-auto}} by part of speech, alphabetized by part of speech. Vininn126 (talk) 13:03, 7 September 2023 (UTC)
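The third request above (turning ", a {{calque|…|nocap=1}}" into ". {{cal|…}}") could be sketched roughly as follows. This is only an illustration in Python, not the script any bot operator actually ran: the function name `rewrite_calque` is hypothetical, and the pattern assumes that `nocap=1` is the last parameter and that the template contains no nested templates. The "variants" mentioned in the request would need their own patterns.

```python
import re

# Assumption: the clause looks exactly like ", a {{calque|pl|...|nocap=1}}"
# with nocap=1 as the final parameter and no nested templates inside.
CALQUE_RE = re.compile(r",\s+a\s+\{\{(?:calque|cal)\|pl\|([^{}]*?)\|nocap=1\}\}")


def rewrite_calque(wikitext: str) -> str:
    """Turn ', a {{calque|pl|X|nocap=1}}' into '. {{cal|pl|X}}'."""
    return CALQUE_RE.sub(lambda m: ". {{cal|pl|" + m.group(1) + "}}", wikitext)


if __name__ == "__main__":
    sample = "From FOO, a {{calque|pl|FOO|BAR|nocap=1}}."
    print(rewrite_calque(sample))
```

Text that does not match the pattern is returned unchanged, so anything unusual would still need manual review.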
- Support! We're gonna have tons of pages lacking etymologies but at least we're gonna have them in one place. As for {{col-auto}}, are there many pages like that? I can't recall many examples. Is it also possible for a bot to go through the most popular users who record audio files so we could compare the pages they've recorded with those we've got and add missing audio files? Tashi (talk) 18:02, 7 September 2023 (UTC)
- @Tashi, to your first question, see User:Vininn126/Miscellaneous. For your second point, there was a bot at some point, not sure when it was last run. Vininn126 (talk) 18:03, 7 September 2023 (UTC)
- Point #2 looks reasonable, I don't feel strongly about #3. For #1 I'd note that some pages, like abbreviations and acronyms, don't really need etymology sections. (What etymology would kpk have, for example? The "etymology" is already in the definition.)
- Regarding the derived/related terms change, I'd also maybe change all instances of {{col-auto|pl|title=noun}} to {{col-auto|pl|title=nouns}}, and so on? I've been adding the titles like in the former example, because I've seen others add it like that, but thinking about it now that doesn't really make sense: the section can be expanded anyway, it doesn't need to be singular for tables that only have one derived term in the part of speech (we already call it "Derived terms," not "Derived term"), and it only gives editors more things to pay attention to for no real benefit. Hythonia (talk) 13:11, 8 September 2023 (UTC)
- These are all fair points - especially about the acronyms, we can skip them. I also prefer to have plurals in the titles, another good point. Vininn126 (talk) 14:20, 8 September 2023 (UTC)
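The singular-to-plural title change suggested above might look something like the sketch below. It is a Python illustration under stated assumptions: the function name `pluralize_titles` and the list of part-of-speech names are hypothetical, and only the `title=noun` to `title=nouns` example comes from the discussion itself.

```python
import re

# Assumption: only these singular titles occur, and they pluralize with "-s".
# The lookahead keeps already-plural titles such as "nouns" untouched.
TITLE_RE = re.compile(
    r"(\{\{col-auto\|pl\|title=)(noun|verb|adjective|adverb)(?=[|}])"
)


def pluralize_titles(wikitext: str) -> str:
    """Rewrite {{col-auto|pl|title=noun...}} to use title=nouns, and so on."""
    return TITLE_RE.sub(r"\1\2s", wikitext)


if __name__ == "__main__":
    print(pluralize_titles("{{col-auto|pl|title=noun|FOO|BAR}}"))
```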
- Also, I feel Polish Latin/Ancient Greek borrowings should all be converted to Learned borrowings. Vininn126 (talk) 15:29, 8 September 2023 (UTC)
- @Benwing2 Would you be willing to help with this? I think these would be my last requests. Vininn126 (talk) 21:42, 8 September 2023 (UTC)
- @Vininn126 It depends on what the requests are and how easy they are to carry out. For example, of the 3 requests you made recently, #3 was easy, #2 ended up easy but I had to ask some clarifying questions, and for #1 I had to ask several clarifying questions and the actual task is not so easy; that's why I haven't done it yet. For bot requests you need to be as specific and detailed as possible as to what you want carried out; ideally I shouldn't have to ask a lot of clarifying questions to figure out the task. For bot requests that were previously done by User:JeffDoozan you might ask them to repeat them again since they already have the scripts written and at hand. Benwing2 (talk) 22:17, 8 September 2023 (UTC)
- @Benwing2 So here are my requests:
- I think I will wait on the titles for col-auto. Vininn126 (talk) 22:24, 8 September 2023 (UTC)
- @Vininn126 OK those don't seem so hard, I will take a look. Benwing2 (talk) 22:27, 8 September 2023 (UTC)
- @Benwing2 Thanks for all the help lately. Like I said, I think this would be everything that's been on my mind and I appreciate it. Vininn126 (talk) 22:27, 8 September 2023 (UTC)
- @Benwing2 Do you think you could run that script for adding {{rfe}} and {{etystub}} soon? Vininn126 (talk) 14:23, 12 September 2023 (UTC)
- @Vininn126 Sure. My only concern would be that this would add a fair amount of noise to the entries but if you're planning on fixing them up soon afterwards it makes sense. Benwing2 (talk) 20:29, 12 September 2023 (UTC)
Declension Modules
@Benwing2 Let me know when you're ready to start/what questions you might have/what you'd like to start with. Vininn126 (talk) 08:27, 9 February 2024 (UTC)
- @Vininn126 OK sure. Benwing2 (talk) 10:25, 9 February 2024 (UTC)
- @Vininn126 Can you point me to any online grammars of Polish, any Polish equivalents of Sloworz (i.e. Polish dictionaries with inflection tables) and any other Polish dictionaries that have inflection info in them, even if it's partial? Benwing2 (talk) 08:59, 10 February 2024 (UTC)
- @Benwing2 WSJP is definitely going to be your best bet - they have inflections for all inflectional words. I also have a few dictionaries with patterns in them. The Wikipedia article is okay, but I think that one book is going to have the most patterns. Vininn126 (talk) 09:08, 10 February 2024 (UTC)
- @Vininn126 Thanks. What is the URL for WSJP? Also any other URL's for other online resources would be great even if they aren't necessarily as good; it's good to be able to cross-check things like this. Benwing2 (talk) 09:19, 10 February 2024 (UTC)
- Also any URL's for online grammars, even if they're in Polish. Benwing2 (talk) 09:20, 10 February 2024 (UTC)
- @Benwing2 WSJP and I just realized (the incredibly slow) SGJP is going to be very useful. Doroszewski has some patterns. this would be useful. Otherwise most resources are behind a paywall. Vininn126 (talk) 09:27, 10 February 2024 (UTC)
- @Benwing2 this is also a pretty good one, to be honest, even though the design is straight from the early Internet (that's how you know it's good). Vininn126 (talk) 10:15, 10 February 2024 (UTC)
- @Vininn126 Gack, that design is awful, with panes blocking other panes, flashing marquee stuff, etc. Thanks for all the links. Benwing2 (talk) 21:40, 10 February 2024 (UTC)
- @Benwing2 Yeah, it's bad. But once you get to the grammar section it's much better, and very, very thorough. Vininn126 (talk) 21:40, 10 February 2024 (UTC)
- @Vininn126 I implemented a Polish adjective module using the same syntax as the Kashubian adjective module (which is also essentially the same syntax used by the Czech, Ukrainian, Belarusian, etc. modules). See User:Benwing2/test-pl-adecl. Benwing2 (talk) 05:23, 11 February 2024 (UTC)
- @Benwing2 Dang you're fast! I see you have passive participles as taking ablaut by default, but zielony/czerwony (and their compounds) without; I think zielony should have the -e-/-o- ablaut alongside the ablautless version. Otherwise I see you have all the vowel/consonant alternations that take place in the virile, and the rest should be regular and easy. Overall Polish has far fewer vowel alternations than Kashubian or even other Lechitic ethnolects, so this should be easier. Vininn126 (talk) 08:56, 11 February 2024 (UTC)
- @Vininn126 You're saying we should generate two forms for nom virile pl for zielony and compounds, one with -oni and one with -eni? What about czerwony and compounds, same? What about słony? BTW what is missing? It should be possible to do jeden just using <decllemma:jedny>, I think. Overrides including for short forms are already present. Benwing2 (talk) 10:48, 11 February 2024 (UTC)
- @Benwing2 zielony should have both, czerwony should not, and słony/słoni. I think we still need cases for ten/tamten, as I mentioned for Kashubian. Vininn126 (talk) 10:54, 11 February 2024 (UTC)
- @Vininn126 For ten/tamten, try <decllemma:ty> and <decllemma:tamty> respectively. I haven't implemented <decllemma:...> for Kashubian adjectives but it should be easy to do. Benwing2 (talk) 11:26, 11 February 2024 (UTC)
- @Benwing2 Decided to add them to the test cases and it indeed works - I assume the nominative singular/accusative singular will be supplied by the pagename? Vininn126 (talk) 11:28, 11 February 2024 (UTC)
- @Benwing2 One thing - it should be tamto/to for neuter singular. Vininn126 (talk) 13:54, 11 February 2024 (UTC)
- @Vininn126 Thanks. Yes, the nom and non-animate acc sg come from the pagename. Benwing2 (talk) 19:38, 11 February 2024 (UTC)
- @Vininn126 I have added all the irregular forms that use {{pl-decl-adj}} (some of which BTW have incorrect declensions currently). Hopefully there are no more issues. Let me know if you find any. At some point we should switch over to using this new module. Benwing2 (talk) 00:38, 12 February 2024 (UTC)
- @Benwing2 My last question is how can we add the archaic dative? Otherwise we could probably bot replace. Vininn126 (talk) 08:49, 12 February 2024 (UTC)
- @Vininn126 Hmm, I will add an option for that. What is it exactly, is it dat masc/neut sg and should it just be an additional form in the same slot, with the appropriate footnote? Which adjectives use it? Benwing2 (talk) 09:04, 12 February 2024 (UTC)
- @Benwing2 Perhaps an additional slot at the bottom. Basically it only appears in constructions like po polsku. Vininn126 (talk) 09:05, 12 February 2024 (UTC)
- @Vininn126 OK. I implemented it for now as an additional form in the same slot with a footnote (see User:Benwing2/test-pl-adecl) as that's how other archaic and rare forms are generally handled, but if you would prefer it as a separate line at the bottom I can do that with a bit more work. Benwing2 (talk) 09:11, 12 February 2024 (UTC)
- Eh, being with dative is probably fine. With the bot replacement we should check for any existing olddat parameters as some might be faulty. Vininn126 (talk) 09:12, 12 February 2024 (UTC)
- @Vininn126 OK, sounds good. BTW I have started on the Polish noun module. I'm using the code in the current Module:pl-noun to get an idea of how to decline the different types of nouns, and following the structure of Module:cs-noun. Hopefully I can get something up and running in a week or so, although it will take longer to work out all the kinks. Benwing2 (talk) 09:28, 12 February 2024 (UTC)
- @Benwing2 I imagine. I remember that ugly site had tons of patterns for specifically nouns as well. I think the Wikipedia article on Polish morphology has a more condensed version. When would you be willing to send your bot to replace the old adjective module? Vininn126 (talk) 09:34, 12 February 2024 (UTC)
- @Vininn126 In a day or so, I think. Benwing2 (talk) 09:36, 12 February 2024 (UTC)
- @Vininn126 See User:Benwing2/pl-decl-adj-auto-olddat. This lists all the occurrences of |olddat=. Can you take a look? There are 504 entries so you don't need to review every one, but please scan them. Almost all are adjectives in -ski/-cki/-zki but there's also prosty at least. Some of them say |olddat=1 or |olddat=yes but a lot say |olddat=ku and several (esp. near the bottom) contain an adjective form as the value. Some of these adjective forms look wrong to me as they end in -ki or -cy instead of -ku. Benwing2 (talk) 22:33, 12 February 2024 (UTC)
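The kinds of |olddat= values described above could be triaged with a small helper along these lines. It is a hedged Python sketch, not the conversion script itself: the function name `classify_olddat` and the bucket labels are hypothetical, and the rules simply mirror the description (1/yes, ku, a full form in -ku, or a suspicious form in -ki/-cy).

```python
def classify_olddat(value: str) -> str:
    """Sort an |olddat= value into a rough review bucket (labels are made up)."""
    v = value.strip().lower()
    if v in {"1", "yes"}:
        return "flag"        # boolean-style value
    if v == "ku":
        return "suffix"      # bare ending
    if v.endswith(("ki", "cy")):
        return "suspect"     # looks like a wrong form, per the comment above
    if v.endswith("ku"):
        return "full-form"   # a complete adjective form in -ku
    return "other"           # anything else needs a human look


if __name__ == "__main__":
    for sample in ["1", "yes", "ku", "polsku", "polski"]:
        print(sample, "->", classify_olddat(sample))
```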
One other thing concerns short forms like wesół. Currently they have an associated declension table that lists the short form as the lemma and otherwise declines like wesoły. I'm thinking rather we should list wesół as the short form in the wesoły declension table and not include a table under wesół. This is consistent with how short forms are handled in Russian, Czech, etc. See User:Benwing2/test-pl-adecl under wesoły (somewhere in the middle) for an example. What do you think? Benwing2 (talk) 22:55, 12 February 2024 (UTC)
- @Benwing2 agreed, except for some where there is only a short form, of which there aren't many. As to zamężna, it's by far the more "lemmatized" form, but WSJP only has zamężny. We could give a note saying "chiefly in the feminine". I have a question - is it possible to give notes about particular forms with this and future modules? Anything with -ki or -cy should be removed. Anything with olddat=ku and I suppose 1/yes should be kept. Vininn126 (talk) 08:41, 13 February 2024 (UTC)
- @Vininn126 For "notes about particular forms", in general yes. Can you give a use case? You can attach footnotes to overrides using e.g. {{pl-adecl|<nom_mp_pers:weseli:wesoli[archaic].short:wesół>}}, which would override the nom_mp_pers (= nom masc pers plural = nom virile plural) with two values and attach a footnote "archaic" to the second one. You can also attach a footnote to all forms using {{pl-adecl|<[footnote]>}}; this is mostly useful using the alternant syntax where you specify two declensions and put a footnote by the forms of one of them (usually the second). Some inflection templates also support an <addnote:...> syntax to make it easy to add a given footnote to an arbitrary set of forms based on Lua patterns. This isn't currently implemented here but if you think it's useful I'll add support for it. Benwing2 (talk) 08:48, 13 February 2024 (UTC)
- @Vininn126 BTW I ran the conversion script and converted everything with -ki or -cy to use regular <archdat>, ignoring the unusual value; this is what the old template did. Let me know if you want the <archdat> removed from these terms. Benwing2 (talk) 08:50, 13 February 2024 (UTC)
- @Vininn126 I also just deleted Template:pl-decl-adj-auto to encourage (lol) people to use the new {{pl-adecl}}, and updated WT:About Polish appropriately. Hope that's ok. {{pl-adecl}} has full documentation, which should help. Benwing2 (talk) 08:55, 13 February 2024 (UTC)
- @Benwing2 That's fantastic, thank you. Being able to do multiword adjectives is also nice - could you generate me a list of such entries so I can apply the new formatting? Vininn126 (talk) 09:00, 13 February 2024 (UTC)
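To make the override example from earlier in this thread concrete (`<nom_mp_pers:weseli:wesoli[archaic].short:wesół>`), here is a toy Python reading of that one string. It is not how the module parses its arguments, which this thread does not show: the function name `parse_overrides` is hypothetical, and the sketch assumes that "." separates entries, ":" separates a key from its values, and "[...]" marks a footnote, with none of those characters appearing inside a footnote.

```python
import re


def parse_overrides(spec: str) -> dict:
    """Toy parser: maps each key to a list of (form, [footnotes]) pairs."""
    inner = spec.strip()
    if inner.startswith("<") and inner.endswith(">"):
        inner = inner[1:-1]
    result = {}
    for part in inner.split("."):          # assumed entry separator
        key, *values = part.split(":")     # assumed key/value separator
        forms = []
        for value in values:
            footnotes = re.findall(r"\[([^\]]*)\]", value)
            form = re.sub(r"\[[^\]]*\]", "", value)
            forms.append((form, footnotes))
        result[key] = forms
    return result


if __name__ == "__main__":
    print(parse_overrides("<nom_mp_pers:weseli:wesoli[archaic].short:wesół>"))
```

Read this way, nom_mp_pers gets two values with "archaic" attached only to the second, which matches the description given with the example.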
- @Vininn126 User:Benwing2/pl-adj-multiword. There are 260 such adjectives but many of them are indeclinable. Benwing2 (talk) 09:05, 13 February 2024 (UTC)
- Awesome, thanks for the help. Is there anything else left for adjectives? Vininn126 (talk) 09:11, 13 February 2024 (UTC)
- @Vininn126 We need to convert the remaining uses of {{pl-decl-adj}} to use {{pl-adecl}}. Also I can't delete the old Module:pl-adj quite yet because it's used by Module:pl-noun to decline adjectival nouns. The new noun module will use the new adjective module to decline adjectival nouns. Benwing2 (talk) 09:20, 13 February 2024 (UTC)
- @Benwing2 I can work on that. Also, the old module might be used in {{pl-decl-phrase}}. Vininn126 (talk) 09:22, 13 February 2024 (UTC)
- @Benwing2 I finished with that first list, if you want to delete it. Vininn126 (talk) 09:39, 13 February 2024 (UTC)
- Also there's only one word using the other old template and it's irregular. Vininn126 (talk) 09:42, 13 February 2024 (UTC)
- @Vininn126 I added support for wszystek to {{pl-adecl}} and deleted {{pl-decl-adj}} as well. Benwing2 (talk) 09:48, 13 February 2024 (UTC)
- BTW {{pl-decl-phrase}} will go away once the new {{pl-ndecl}} is working because it will support multiword phrases out of the box. Benwing2 (talk) 20:36, 13 February 2024 (UTC)
- @Benwing2 sounds good to me. Including noun + adjective combinations? Vininn126 (talk) 20:43, 13 February 2024 (UTC)
- @Vininn126 Yes, it will do adj+noun, noun+adj, adj+adj+noun, etc. combinations; you specify a <+> after the adjectives and it inherits the gender, number and animacy of the closest noun. Benwing2 (talk) 20:51, 13 February 2024 (UTC)
- Great! Vininn126 (talk) 21:09, 13 February 2024 (UTC)
- @Vininn126 BTW that ugly-ass site with its obnoxious pop-up ads is super helpful with its inflection tables showing all the different possibilities for noun inflection. Hopefully once it comes to Kashubian I can get pretty far by looking up the Kashubian equivalent of the nouns in the table. (It looks like a lot of the patterns aren't properly handled by the current noun module, though.) Benwing2 (talk) 00:25, 14 February 2024 (UTC)
- Also can you help me understand the difference between the plural MW and Mx rows in [1]? These appear to be masculine personal nouns, where the M here means "nominative plural" and the endings are different, but I'm not sure why. There's a comment in the containing page [2] that reads this: "WARNING: In the adjective declension the coloured nominative plural forms (Mx) remain clearly theoretical in general and are not used at all, similarly like in some other cases, e.g. chłopce, dziadzie." But sometimes the MW row is missing, as with nierób, głodomór, kiep and pędziwiatr. (The corresponding Polish text says "UWAGA: W deklinacji przymiotnikowej formy nacechowane mianownika liczby mnogiej (Mx) są w zasadzie czysto teoretyczne i w ogóle nieużywane, podobnie jak i w niektórych innych przypadkach, np. chłopce, dziadzie" which seems to mean the same thing.) Benwing2 (talk) 05:05, 14 February 2024 (UTC)
- Also I notice on karzeł that (a) it seems to be using the anim gender to mean animal, which is incorrect (there is an anml gender for this purpose); (b) according to the meanings, this term can be either personal, animal or inanimate, and presumably they have different declensions and should be split into three sections. (Or at least, this is parallel to how Ukrainian is handled, which seems the closest Slavic parallel as it also has a three-way animacy distinction that affects several parts of the declension. In Russian there's only a two-way animacy distinction and it only affects the accusative singular and plural, so "bianimate" nouns can be handled in a single table with separate animate and inanimate rows in the accusative.) Benwing2 (talk) 05:20, 14 February 2024 (UTC)
- Ahahaha, your first comment made me laugh.
- I believe he's referring to what we call the "deprecative" form, in which a masculine personal noun is "downgraded" to a masculine animal noun - a lot of words, especially derogatory ones, won't have that personal ending, and will only have the animal ending. I think we should have it on by default for masculine personal nouns - it's falling out of use but it is part of the standard.
- Well in Polish linguistics it's often called "męskozwierzęcy", literally "masculine animal", as the largest semantic class following this group is animals (but also some games, dances, brand names, among others...) Vininn126 (talk) 10:17, 14 February 2024 (UTC)
@Vininn126 Please see Module:User:Benwing2/pl-noun-examples.lua. I compiled this based on the tables in Grzegorz Jagodziński's ugly site. For each noun I list all the unpredictable forms along with an English gloss. This helps me tremendously in figuring out the extent of variation of stem and ending alternants and how to structure the declension module. We need to do something similar for Kashubian; not necessarily as detailed but it needs to cover the corresponding variants as much as possible. Can you help me with this? As a first step can you find the Kashubian cognate of each Polish term (to the extent one exists)? That way we can look up the term in Sloworz and/or your Kashubian dictionaries. Thanks for any help you can give. Benwing2 (talk) 04:39, 15 February 2024 (UTC)
- @Benwing2 I can work on something similar, sure. Vininn126 (talk) 08:54, 15 February 2024 (UTC)
- @Vininn126 I am looking through SGJP, which BTW is an awesome resource with even more sets of tables than Grzegorz's site. One thing I'm curious about is "neutral" vs. "characterized" endings in the fem gen pl. For example, podłóż here is indicated as having only a neutral gen pl, but loża here has only a characterized gen pl 'lóż'. Meanwhile Maja (the female name, not "Mayan") here is given with both a neutral Mai labeled hom. and a characterized Maj labeled char.. From the names of the variants I take it this is some register distinction but I'm not exactly sure what. Can you help explain and do you know whether we need to include both forms (to the extent they both exist)? Benwing2 (talk) 09:42, 15 February 2024 (UTC)
- @Benwing2 According to their list of abbreviations, hom. means "homonymic". They also give "marked" and "unmarked" for char. and neut. I'm not actually too sure what's going on here; @Silmethule, could you give some insight? Vininn126 (talk) 09:50, 15 February 2024 (UTC)
- Thanks! BTW I expanded Module:User:Benwing2/pl-noun-examples.lua with a list of all the types of feminine cons-stem nouns in SGJP along with their counts (look at the end). I did this originally to determine whether to default the nom pl to -e or -i (it seems it should be -e always except for nouns in -ość) but in the process I added a column for what the indicator spec would look like in the new {{pl-ndecl}}. This should give you an idea of what the specs look like for different sorts of nouns. In general the gender is optional, but most of these specs need the gender specified as otherwise it will probably default to masculine (but there will be a rule that nouns in -ość default to feminine as there are 64,000+ of them). You can see the use of gender indicators like f, the plurale tantum indicator pl, the reducibility ("fleeting e") indicator *, the ó/o and ą/ę alternation indicator #, override indicators like insplmi (use -mi in the ins pl) and insplami:mi (use either -ami or -mi in the ins pl) and the use of decllemma:... in conjunction with cześć (which has irregular reducibility). Note also the following indicator: ((Wielkanoc<f>,Wielka<+>noc<f.[rare]>)). This says that Wielkanoc declines either as a regular feminine consonant-stem noun with gen_sg Wielkanocy, or an amalgamation of Wielka (declined adjectivally) and noc (a regular feminine consonant-stem noun), where the latter is rare. The syntax ((...,...)) is called an alternation and lets you pack two arbitrarily different declensions into a single table. The one thing I'm not certain about is the handling of plurale tantum nouns. Even though the agreement pattern is only virile/non-virile, it seems to me their actual declension is different depending on whether they are related to singular nouns that are masc, fem or neut, so for now I am requiring that a gender be specified. This made total sense for Czech where the agreement in the plural really is different for masc animate, masc inanimate, fem and neut but here it's more murky. This probably needs rethinking. Benwing2 (talk) 10:06, 15 February 2024 (UTC)
- @Benwing2 Complicated! And I have a lot of work ahead of me for Kashubian (it might take me a while to figure out these patterns as I don't think anyone has sat down with these patterns before).
- If I understand correctly, the module will be able to mostly predict the gender from the pagename without any supplied parameters? As for plurale tantum nouns, I wonder if it might be best to give an argument |pl-m-pr=, |pl-f= or something like that. I think they will be harder to predict from the page name. Vininn126 (talk) 10:15, 15 February 2024 (UTC)
- @Vininn126 Yeah, it is a bit complex but Polish declensions themselves are rather complex, and I think you will find the new system quite logical once you get used to it. I do appreciate any work you can do for Kashubian. As for plurale tantum nouns, it sounds like what you're proposing is essentially what I'm planning on doing, which is to require the gender to be specified for them along with an indication that they are plurale tantum nouns. Benwing2 (talk) 11:05, 15 February 2024 (UTC)
- @Benwing2 I'm sure it will be - a complicated system needs a complicated solution. And I'm really not sure there's another way to tell the code otherwise - it will still be better than what we have now. The problem is that tant. pl. nouns can look like nominative singular forms, and there aren't really a ton of patterns for them. One that comes to mind is that -alia when tant. pl. is almost always neuter. Vininn126 (talk) 11:06, 15 February 2024 (UTC)
- @Vininn126 OK, if that's the case I can make those default to neuter, similarly to how I will make nouns in -ość default to feminine (in Russian we have similar defaults but require gender to be specified for plurale tantum nouns as well as nouns in -ь except for those with certain recognized endings, e.g. masculine -тель, feminine -ость, a few others maybe). The analogous thing for Polish would be to require gender to be specified for nouns ending in a soft or "formerly soft" consonant such as ń/ś/ć/ź/c/ż/j/l/cz/dz/sz/rz, again with certain common endings defaulted. I don't know if this makes sense to do. Benwing2 (talk) 11:13, 15 February 2024 (UTC)
- @Benwing2 Most likely yes - with an optional parameter to tell it that it's a masculine soft pattern instead. Vininn126 (talk) 11:16, 15 February 2024 (UTC)
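The defaulting rules floated in this exchange can be sketched roughly as follows. This is only an illustration of the proposal as discussed (nouns in -ość default to feminine, plurale tantum nouns in -alia default to neuter, other plurale tantum nouns and nouns ending in a soft or "formerly soft" consonant need an explicit gender, other consonant-final nouns default to masculine). The function name is hypothetical, and the handling of vowel-final lemmas (-a as feminine, -o/-e/-ę as neuter) is an added assumption that is not stated in the thread.

```python
# Soft or "formerly soft" final consonants as listed in the discussion.
SOFT_ENDINGS = ("cz", "dz", "sz", "rz", "ń", "ś", "ć", "ź", "c", "ż", "j", "l")

def default_gender(lemma, plurale_tantum=False):
    """Return 'm', 'f' or 'n', or None when the gender would have to be given explicitly.

    Hypothetical helper; not the actual module logic.
    """
    if plurale_tantum:
        # Only -alia is proposed as a recognizable plurale tantum pattern.
        return "n" if lemma.endswith("alia") else None
    if lemma.endswith("ość"):
        return "f"  # must be checked before the soft-consonant rule (ć is soft)
    if lemma.endswith("a"):
        return "f"  # assumption, not from the thread
    if lemma.endswith(("o", "e", "ę")):
        return "n"  # assumption, not from the thread
    if lemma.endswith(SOFT_ENDINGS):
        return None  # ambiguous between masculine and feminine
    return "m"
```

Under this sketch noc and Wielkanoc return None, which matches the specs above where the f indicator is given explicitly.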
- @Vininn126, Benwing2: I’m not entirely sure, but having skimmed over their Theoretical Basics I think they mean that Mai in gen. pl. is homonymic with Mai in gen. sg. (ie. they mark pl. forms homonymic with their sg. equivalents). As for Maj – it’s generally not used in colloquial speech, so I guess it’s just marked rarer literary form. Cf. the paradigm for aksjomatyzacja with gen. pl. arch. char. (archaic characteristic/marked) aksjomatyzacyj // Silmeth @talk 10:19, 15 February 2024 (UTC)
- See §3.1.5 Uniformizm:
- „Mianowicie w dopełniaczu liczby mnogiej niektórych rzeczowników żeńskich występują warianty (często o ograniczonej wymienności): możemy użyć formy homonimicznej (uniforemnej), która jest synkretyczna dla tego przypadka oraz kilku przypadków liczby pojedynczej (hom, np. funkcji, teorii, kopalni), albo formy charakterystycznej (nieuniforemnej, char, np. funkcyj, teoryj, kopalń). Opozycję tę nazywamy uniformizmem.
- Funkcje tych wariantów (jeżeli istnieją) są różne. Dla wielu rzeczowników są one używane wymiennie (np. głuszy — głusz, kniei — kniej). Dla niektórych są zróżnicowane i zdarza się, że jeden z nich jest przez normatywistów oceniany negatywnie (np. alej). W wypadkach najbardziej wyrazistych staramy się pokazywać takie uwarunkowania.
- (…)”
- // Silmeth @talk 10:33, 15 February 2024 (UTC)
- @Silmethule OK thanks! I ran this through Google Translate and they seem to be saying sometimes the two forms are interchangeable but sometimes one or the other is proscribed or has some sort of register difference (e.g. literary vs. colloquial), and sometimes they will indicate this. I guess that's what aksjomatyzacyj means with its arch. label. The old module does indicate for nouns in -Cja, a regular gen pl form in -Cji and an archaic form in -Cyj, and I am planning on keeping this functionality if you guys think it makes sense to do so. Otherwise this might require overrides (for which footnotes can be supplied) to indicate the rarer variants. Benwing2 (talk) 11:01, 15 February 2024 (UTC)
- @Benwing2 Keeping the stylized archaic forms would be good - basically every normative dictionary does this too. Otherwise yes the ability to override and add footnotes for exceptions would be great. Vininn126 (talk) 11:03, 15 February 2024 (UTC)
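To make the -Cja behaviour concrete, here is a small Python sketch of the two genitive plural variants mentioned above: the regular form in -Cji and the archaic form in -Cyj, with the archaic one carrying a footnote. The function name and the (form, footnote) return shape are assumptions for illustration only and do not reflect how the old or new module is written.

```python
VOWELS = set("aąeęioóuy")

def gen_pl_cja(lemma):
    """For a lemma in consonant + 'ja', return [(form, footnote_or_None), ...].

    Returns None for lemmas that do not fit the -Cja pattern (e.g. Maja,
    where a vowel precedes -ja). Hypothetical helper, not module code.
    """
    if len(lemma) < 3 or not lemma.endswith("ja") or lemma[-3].lower() in VOWELS:
        return None
    stem = lemma[:-2]
    return [(stem + "ji", None), (stem + "yj", "archaic")]
```

For aksjomatyzacja this gives aksjomatyzacji as the unmarked form and aksjomatyzacyj tagged as archaic, matching the SGJP paradigm cited above.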
- @Vininn126 Just FYI I am chugging along. I have gone through all the SGJP declension types, which helps me work out what the default endings should be for which case/number combinations, whether to make a noun reducible and how to implement reducibility, etc. (There are a ton of patterns, esp. for foreign nouns; Polish here is similar to Czech in trying to decline every damn foreign noun according to its pronunciation while keeping the original spelling as much as possible. In Russian this is much less of an issue, because terms get transcribed into Cyrillic and those that don't fit an obvious existing Slavic declension pattern get made indeclinable.) Benwing2 (talk) 08:27, 19 February 2024 (UTC)
- @Benwing2 Nice. Anything I can do? I'm working on the Kashubian patterns - I've gotten through masculine nouns and I'm having someone double check my work. I don't think we're gonna need instrumental plural since -mi is a pretty rare ending in Kashubian - I do think we might need to check nouns ending in a velar, but I think the alternation of cz/dż > k/g will be entirely regular, too. Vininn126 (talk) 08:29, 19 February 2024 (UTC)
- @Vininn126 Thanks for your work on Kashubian patterns. I don't think there's anything you need to do for Polish right now, though. I'll let you know as I make further progress. Benwing2 (talk) 08:35, 19 February 2024 (UTC)
- @Benwing2 I just realized we may want to move any short forms from alternative forms to the new adjective module with any qualifiers they may have. Vininn126 (talk) 08:44, 19 February 2024 (UTC)
- @Vininn126 Sounds good. I tried to do something of this sort already but I'm sure some got missed. Benwing2 (talk) 09:01, 19 February 2024 (UTC)
- @Benwing2 Compare cały with the Middle Polish short form cał. Vininn126 (talk) 09:02, 19 February 2024 (UTC)
- BTW how can one add a qualifier for a form? Vininn126 (talk) 09:03, 19 February 2024 (UTC)
- @Vininn126 In brackets after the form, e.g. {{pl-adecl|<short:cał[Middle Polish]>}} or something. Benwing2 (talk) 09:04, 19 February 2024 (UTC)
- @Benwing2 One thing about Polish short forms is they are masculine only - could we have it not occupy the feminine and neuter columns? Vininn126 (talk) 09:24, 19 February 2024 (UTC)
declension update
@Vininn126 BTW I ended up copying all the patterns from SGJP into comments in the declension module I'm working on, with various case forms (e.g. for masculine nouns, the gen_sg, loc_sg, nom_pl, nom_pl_depr and gen_pl). This took a lot of doing (e.g. there are 640 masculine patterns not counting the adjectival ones, which may be another 50 or so) but it really helps in understanding the sort of variation I need to account for. In particular there is an awful lot of variation in the masculine personal/virile nom_pl and in the gen_pl. It seems to me the most common endings for masculine hard-stem nouns are -owie in the personal nom_pl and -ów in the gen_pl but soft-stem nouns are trickier and I haven't yet worked out what the defaults should be. I will also need to come up with a way of handling foreign nouns with silent letters (e.g. Jacques and software), which seem to take an apostrophe before most case endings, and acronyms (e.g. HTML, GUC, CAD), which seem to take a hyphen before most case endings. I ran into this issue in Czech as well and the solution there has a bunch of special casing that doesn't work in all circumstances, but with the benefit of the SGJP patterns I should be able to come up with a general solution.
On another matter, I think we should do a bot run to change all uses of the "animate" (an) gender code in Polish to the "animal" (anml) code. The "animate" code is intended for languages like Russian that have only a two-way animacy distinction (animate/inanimate); languages like Ukrainian and Belarusian that have a 3-way animacy distinction use personal/animal/inanimate, and it seems to me Polish should do the same.
Also I have been thinking how to handle neutral vs. deprecative forms in the masculine personal nom_pl as well as neutral vs. marked forms in the feminine gen_pl. I am thinking the former will be handled using a separate row; essentially the nom_pl for virile nouns will split into two rows, with the neutral form above the deprecative form. However, I think the gen_pl distinction is better handled using just one slot, with two values in the slot when necessary along with a footnote on the marked one indicating that it's marked (or archaic/etc. as the case may be). The reason for this is that all or almost all masculine personal nouns seem to have both neutral and deprecative forms in the nom_pl (although sometimes they are the same), whereas many feminine nouns are missing either the "neutral" or "marked" form, and in such a situation it's not obvious it makes sense to maintain the distinction. Benwing2 (talk) 21:49, 22 February 2024 (UTC)
- @Benwing2 Nicely done. I'm slowly chipping away at the Kashubian forms with some help. I've noticed a lot more ending leveling.
- For soft endings you see -i/-y a lot but also -(i)e. I'm not sure which is more common. As to foreign borrowings, your understanding is correct.
- I agree to switching to animal, and this might also apply to some other Slavic languages.
- WSJP handles deprecative forms similarly, but they also add the archaic/stylized genitive plural on a separate line, but I also understand your reasoning. This is mostly a stylistic thing that might be best to see first and then decide. Vininn126 (talk) 22:07, 22 February 2024 (UTC)
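As a rough illustration of the foreign-noun and acronym behaviour confirmed above, the following Python sketch joins a case ending to a stem with a hyphen for acronyms and an apostrophe for stems flagged as ending in a silent letter. Everything here is an assumption for illustration: the function name, the all-caps acronym test, and the silent_final flag are hypothetical. The thread says only that these separators appear before "most" case endings, and the exceptions are not modelled. The endings in the examples are placeholders supplied by the caller.

```python
def attach_ending(stem, ending, silent_final=False):
    """Join a case ending to a stem.

    Uses '-' after acronyms and an apostrophe after stems whose final
    letter is silent. Hypothetical helper; exceptions are not handled.
    """
    if not ending:
        return stem
    if len(stem) > 1 and stem.isupper():  # crude acronym test (assumption)
        return stem + "-" + ending
    if silent_final:
        return stem + "'" + ending
    return stem + ending
```

A real implementation would presumably need a per-ending decision rather than a single flag, which is where the SGJP patterns mentioned above would come in.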
- @Vininn126 Cool. Note that WSJP imports their declension tables from SGJP so it's not surprising they use the same layout. As for "animal", there are 13 languages in Category:Personal nouns by language. Besides Belarusian and Ukrainian (which already use "animal") and Polish, this leaves:
- Carpathian Rusyn (3 animate nouns, 2 of which are actually personal);
- Kashubian (88 animate nouns);
- Masurian (3 animate nouns);
- Old Czech (already handled correctly; 18 animal nouns, no animate nouns);
- Old Polish (no animate nouns);
- Old Ruthenian (110 animate nouns);
- Old Slovak (5 animate nouns, all of which are actually personal);
- Pannonian Rusyn (21 animate nouns);
- Silesian (45 animate nouns);
- Upper Sorbian (5 animate nouns, most of which are actually personal).
- Note that a lot of the above languages are failing to properly mark all their nouns for animacy. Benwing2 (talk) 01:40, 23 February 2024 (UTC)
- @Benwing2 With Old Polish, part of my concern is that it's hard to be 100% sure about the animacy of a given noun. We could probably "reconstruct" it based on the children, but these things change with time. A lot of inanimate nouns become animal in colloquial Polish for example. Otherwise I have no problem switching to animal. An easy fix would be to simply change the appropriate modules/templates so that the abbreviation m-an gives "animal" instead of animate. Vininn126 (talk) 08:00, 23 February 2024 (UTC)
- @Vininn126 For various reasons I'd prefer to just fix the genders to use anml rather than adding a hack like you propose: For one it's just as easy, for two it doesn't work to hack the modules/templates in cases where {{head}} is called directly instead of the appropriate lang-specific template. BTW I already did this for Upper Sorbian as well as added missing animacies (working currently on Lower Sorbian), and partly did this for Old Ruthenian, with some {{attn}}'s added in places where I wasn't sure of the animacy, e.g. if a particular word means both "fox" and "fox fur/fox pelt", is it anml for both or is the "fox fur/pelt" meaning inanimate? Similarly if a word has both the literal meaning "sheep" (anml) and the figurative meaning "Christian follower", is it personal in the latter meaning? I assumed yes in the second case (it's personal) but no in the first case (it's still animate). There are {{attn}}'s everywhere I wasn't sure. Benwing2 (talk) 09:03, 23 February 2024 (UTC)
- BTW I have discovered that Lower Sorbian "simplified" their adjective concord in a way that, paradoxically, requires that animacies be assigned to nouns of all genders, whereas for Upper Sorbian, like Polish, only masculines need animacy assigned. Benwing2 (talk) 09:04, 23 February 2024 (UTC)
- @Benwing2 Interesting. My work and understanding of the Sorbian twins is rather limited. Also, if it's just as easy, then that seems fine by me. Vininn126 (talk) 09:08, 23 February 2024 (UTC)
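A bot run of the kind agreed on here could, in very simplified form, look like the Python sketch below, which rewrites a gender parameter whose value is m-an so that it uses the animal code instead. This is not the actual bot script. The function name is hypothetical, the full replacement code m-anml is assumed from the anml code named above, and the sketch only touches named g=, g2=, ... parameters. Positional genders in language-specific templates and restricting the change to the right language section are left out.

```python
import re

# Matches a gender parameter (g=, g2=, ...) whose value is exactly m-an.
# The lookahead keeps already-converted values such as m-anml untouched.
AN_CODE = re.compile(r"(\|\s*g\d*\s*=\s*m)-an(?=\s*[|}])")

def animate_to_animal(wikitext):
    """Replace the animate gender code with the animal one (assumed to be m-anml)."""
    return AN_CODE.sub(r"\1-anml", wikitext)
```

Because the pattern requires the value to end right after -an, running it twice over the same text is harmless.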
unified Lechitic headword module
@Vininn126 Just FYI, I wrote a unified Lechitic headword module that should eventually handle all Lechitic languages and I'm currently debugging it. It currently supports Polish, Kashubian, Silesian and Masurian and makes them (as much as possible) have unified interfaces for the various headword templates. It is based on Module:pl-headword with lots of fixes. The participle support is currently Polish-only as I don't know much about how the other languages handle their participles. When I get to using it for Polish there will be a few changes to the headword syntax, most notably that {{pl-verb}} will require the aspect to be put in |1= instead of |a=, for consistency with how other Slavic languages handle this. It will also support biaspectual verbs (code |1=biasp or |1=both, as with Russian). Benwing2 (talk) 21:14, 24 February 2024 (UTC)
- @Benwing2 Oh man, I've been thinking of exactly this for a long time. This is a dream come true. I don't think participles should be in the headword. Verbal nouns could be for Kashubian, Masurian, and Old Polish, but Polish verbal nouns and potentially Silesian ones are 100% regular (or at least derivable). Let me know when you deploy it. Also something similar for Old Polish would be nice. Vininn126 (talk) 23:12, 24 February 2024 (UTC)
- I'm assuming this includes nouns, verbs, adjective, and adverbs? Vininn126 (talk) 23:18, 24 February 2024 (UTC)
- Also in what ways do the various languages not line up? Vininn126 (talk) 23:27, 24 February 2024 (UTC)
- It supports nouns, proper nouns, verbs, adjectives, adverbs and (for Polish) participles. Participle support is based on the existing Module:pl-headword, which supports classifying them as adverbial vs. adjectival; for adjectival participles, active vs. passive, and for adverbial participles, contemporary vs. anterior. It autodetects the participle type from the ending if not given. It's in Module:zlw-lch-headword and deployed for Silesian currently. Other than issues with participles, the particular way of forming superlatives and periphrastic comparatives differs from language to language and maybe the possible genders do as well, depending on how animacy concord works with adjectives esp. in the plural. For the moment I'm assuming all languages work like Polish. There isn't yet support for verbal nouns but this can be added. Benwing2 (talk) 00:28, 25 February 2024 (UTC)
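To illustrate what ending-based autodetection of the participle type could look like, here is a minimal Python sketch. The endings used are typical Polish ones (-ąc for the contemporary adverbial participle, -wszy/-łszy for the anterior adverbial participle, -ący for the active adjectival participle, -ny/-ty for the passive adjectival participle). They are supplied here as an assumption and are not taken from Module:pl-headword or Module:zlw-lch-headword, whose actual logic may differ. The function name is hypothetical.

```python
def classify_participle(form):
    """Guess (kind, subtype) of a Polish participle from its ending, or None if unknown.

    Hypothetical helper for illustration; not the module's actual detection code.
    """
    if form.endswith(("wszy", "łszy")):
        return ("adverbial", "anterior")
    if form.endswith("ąc"):
        return ("adverbial", "contemporary")
    if form.endswith("ący"):
        return ("adjectival", "active")
    if form.endswith(("ny", "ty")):
        return ("adjectival", "passive")
    return None
```

For the other Lechitic languages the table of endings would have to be replaced, and the anterior adverbial branch dropped where that participle does not exist.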
- I've deployed it for Masurian and Kashubian as well. Polish is on its way. Documentation for the various templates for the deployed languages is up-to-date. Note a few syntactic changes:
- In place of b or bardziej or whatever to indicate a periphrastic comparative, use peri.
- Aspects for verbs go in |1=, |g2=, ...
- For nouns and proper nouns, in place of vr and nv, use vr-p and nv-p.
- Comparative qualifier params |q=, |q2=, ... have been removed in favor of inline modifiers.
- Benwing2 (talk) 08:34, 25 February 2024 (UTC)
- @Benwing2 Ah, I see. Well, both adjectival participles exist Pan-Lechitically, but typically only Polish has the anterior adverbial participle. As for setting up each comparative and superlative, Masurian might be somewhat tricky as there is an East-West split. It would be nice if we could set up a way to automatically supply both based on a single parameter.
- There's also the issue of Slovincian, but that whole thing needs an overhaul and I'm slowly reading what grammars exist for it so I think any changes can wait for the future.
- How do we supply the aspectual pair for a verb, and also can we handle things like frequentatives and the like?
- Finally, will Old Polish be converted? I think it already used the old Polish headword module. Vininn126 (talk) 08:47, 25 February 2024 (UTC)
- @Benwing2 Also I see support for the femeq, but we should add support for the marginal, and currently Polish-specific neuter equivalent. I have at least one term supported by quotes and there are definitely more that could be added. Vininn126 (talk) 08:50, 25 February 2024 (UTC)
- @Vininn126 I'll do Old Polish after I get Polish converted. The aspectual pair for a verb is specified using |pf= for perfectives, |impf= for imperfectives, |freq= for frequentatives, etc. It should be documented in {{szl-verb}} etc. Currently the East-West Masurian comparative split is specified using qualifiers, e.g. {{zlw-mas-adv|dáwni<q:Western Masuria>|dáwniéj<q:Eastern Masuria>}}
- Eventually we should make this easier but I think that would be done as part of implementing code to autogenerate the synthetic comparative form, as is done for Russian. I'll enable the participle code for the other languages but only allow anterior adverbial participles for Polish, if that makes sense. I'll still need info on how to autodetect the participle type based on the ending, i.e. what the typical endings are for active adjectival, passive adjectival and adverbial participles. I'll also add neuter equivalent to nouns, that should be easy. Benwing2 (talk) 08:58, 25 February 2024 (UTC)
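For readers unfamiliar with inline modifiers, here is a minimal Python sketch of how an argument such as dáwni<q:Western Masuria> could be split into a form and its qualifier. It is only an illustration: the function name parse_inline_modifiers is hypothetical, and the real parsing is done in Lua by the headword module, whose details are not shown in this thread.

```python
import re

# Matches one inline modifier of the shape <key:value>.
_MODIFIER = re.compile(r"<([a-z]+):([^<>]*)>")


def parse_inline_modifiers(arg):
    """Split 'form<key:value>...' into (form, {key: value}).

    Hypothetical helper for illustration; not the module's actual code.
    """
    first = arg.find("<")
    if first == -1:
        # No modifiers at all: the whole argument is the form.
        return arg, {}
    form = arg[:first]
    modifiers = {key: value for key, value in _MODIFIER.findall(arg[first:])}
    return form, modifiers


if __name__ == "__main__":
    for raw in ["dáwni<q:Western Masuria>", "dáwniéj<q:Eastern Masuria>"]:
        print(parse_inline_modifiers(raw))
```

The point of the design is that the qualifier travels with the form it belongs to, so separate |q=, |q2= parameters are no longer needed.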
- @Benwing2 Participles are as follows:
- Polish:
- Active adjectival participle: -ący
- Passive adjectival participle: -(a/o)ny, -ty
- Active adverbial participle: -ąc
- Anterior adverbial participle: -ąwszy, -łszy
- Silesian:
- Active adjectival participle: -ōncy
- Passive adjectival participle: -(a/ō)ny, -ty
- Active adverbial participle: -ōnc
- Kashubian:
- Active adjectival participle: -ący
- Passive adjectival participle: -(a/o)ny, -ty
- Active adverbial participle: -ąc
- As for the others, I'm not 100% sure. I can find out for Masurian, but I'm less certain about Old Polish. Vininn126 (talk) 09:07, 25 February 2024 (UTC)
- Ah wait, apparently Kashubian does have an anterior adverbial participle -wszë/-łszë Vininn126 (talk) 09:09, 25 February 2024 (UTC)
- @Benwing2
- Kashubian has -ąc and -ącë for the active adverbial, which could pose a problem.
- Masurian:
- Active adjectival participle: -óncÿ
- Passive adjectival participle: -(a/ó)nÿ, -ti
- Active adverbial participle: -ónc(ÿ)
- Anterior adverbial participle: -wsÿ/-łsÿ
- Vininn126 (talk) 09:18, 25 February 2024 (UTC)
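The ending lists above lend themselves to a simple suffix lookup. The following Python sketch is an assumption-laden illustration, not the code in Module:zlw-lch-headword (which is Lua): the table ENDINGS and the function guess_participle_types are hypothetical names, only the Polish and Masurian endings listed above are included, and "contemporary adverbial" is used for what the lists call the active adverbial participle. Because Masurian -ónc(ÿ) expands to both -ónc and -óncÿ, a form in -óncÿ matches two types, so the function returns every candidate rather than picking one.

```python
import unicodedata

# (suffix, participle type) pairs per language code, taken from the lists above.
# "-(a/o)ny" is expanded to "any"/"ony"; "-ónc(ÿ)" to "ónc"/"óncÿ".
ENDINGS = {
    "pl": [
        ("ący", "active adjectival"),
        ("any", "passive adjectival"),
        ("ony", "passive adjectival"),
        ("ty", "passive adjectival"),
        ("ąc", "contemporary adverbial"),
        ("ąwszy", "anterior adverbial"),
        ("łszy", "anterior adverbial"),
    ],
    "zlw-mas": [
        ("óncÿ", "active adjectival"),
        ("anÿ", "passive adjectival"),
        ("ónÿ", "passive adjectival"),
        ("ti", "passive adjectival"),
        ("ónc", "contemporary adverbial"),
        ("óncÿ", "contemporary adverbial"),
        ("wsÿ", "anterior adverbial"),
        ("łsÿ", "anterior adverbial"),
    ],
}


def guess_participle_types(lang, word):
    """Return every participle type whose listed suffix matches the word."""
    # Normalize so precomposed and combining diacritics compare equal.
    word = unicodedata.normalize("NFC", word)
    found = []
    for suffix, ptype in ENDINGS[lang]:
        suffix = unicodedata.normalize("NFC", suffix)
        if word.endswith(suffix) and ptype not in found:
            found.append(ptype)
    return found


if __name__ == "__main__":
    print(guess_participle_types("pl", "robiący"))
    print(guess_participle_types("pl", "robiąc"))
    # "XYZ" is a made-up stem, used only to show the ambiguous ending.
    print(guess_participle_types("zlw-mas", "XYZóncÿ"))
```

When more than one type comes back, an explicit parameter would presumably still be needed to disambiguate.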
- @Benwing2 Would it be possible to distinguish a relative adjective parameter and possessive adjective parameter for at least Old Polish and Kashubian for noun headwords? Vininn126 (talk) 10:38, 25 February 2024 (UTC)
- @Vininn126 I deployed the new module for Polish, after trying to fix all the errors that would appear. In fact the errors are now appearing as fast as I can fix them but hopefully they should all be flushed out soon. Can you take a look at zabiegły and legły? Not sure what's going on with these participles. Benwing2 (talk) 05:16, 26 February 2024 (UTC)
- @Benwing2 rozkurwić is correct. I see you caught abdach. architekt can be feminine - quite a few words can be both, but the feminine is then indeclinable. wynegocjować is perfective. umurzać is perfective. marudzić is imperfective. tor is inanimate. I'm not sure about fizol - it could indeed be animal and not personal, but I'd suspect personal. Vininn126 (talk) 09:33, 26 February 2024 (UTC)
- @Benwing2 I forgot about that ending - a few passive adjective participles end with -ły, we'll have to update the module. It will be -ły in Silesian and Polish, and -łi in Kashubian and Masurian. Vininn126 (talk) 09:35, 26 February 2024 (UTC)
- Going to sleep now but I will add participle support for the remaining langs in the morning. A couple of questions:
- What about Old Polish? How does it form periphrastic comparatives (like Polish bardziej) and superlatives (like Polish naj-)? What are the participle endings (if you're not sure we can just omit participle support for now)?
- I notice Masurian above has -óncÿ for active adjectival participle but -ónc(ÿ) for active/contemporary adverbial participle. This means -óncÿ is ambiguous. Is this correct? (And should it be "contemporary", as we have it currently, or "active"?) Benwing2 (talk) 10:17, 26 February 2024 (UTC)
- @Benwing2 I cleaned up the other gender requests for Polish - we might want a way to mark an unknown masculine gender for Middle Polish.
- I'm not sure for the periphrastic constructions, the superlative was either na- or naj-, but rare. Most of the time unattested, and I don't think categorizing adjectives and adverbs as not comparable is the best idea since we just simply don't know a lot of the time, as the comparative simply wasn't attested. We also normally only have the comparative, not the superlative. So we need a lot more flexibility. This might be a good idea for Middle Polish too... I'm also not sure about Old Polish participles, they'd largely be the same except the adjectival would be -ący and adverbial would be -ąc(y), so the same ambiguity.
- That's correct, and for Kashubian as well. I think "contemporary" might be slightly better - we already call the other one "anterior" and also that's what most grammars call it, and that's what it's used for, marking contemporary actions. Vininn126 (talk) 10:31, 26 February 2024 (UTC)
- (btw good night and thanks for the work!)
- @Silmethule found a hapax barziej for Old Polish for a periphrastic comparative. I found nothing for the superlative. So our best approach might be to only have an optional parameter. Vininn126 (talk) 11:12, 26 February 2024 (UTC)
- @Benwing2 It would also be nice to add an indecl=1 for other parts of speech like adjectives. Vininn126 (talk) 12:03, 26 February 2024 (UTC)
- @Benwing2 Having talked with some editors on the discord, it seems Slovincian and Polabian (the other two remaining Lechitic languages) generally need all the same things too (to be honest I think all Slavic languages). I'm not too sure of their orthographical versions of things, especially for participles and adjectives/adverbs, but for nouns and verbs I think we could safely include them, unless you think that's a bad idea. Vininn126 (talk) 17:29, 26 February 2024 (UTC)
- @Vininn126 I've added the other three languages although I haven't yet converted the templates to use the unified module because I need to review the lemmas of each language beforehand. Adjectives already support |indecl=1 although it's not properly documented. Benwing2 (talk) 21:01, 26 February 2024 (UTC)
- @Benwing2 Great! It's very satisfying unifying these things. Can we categorize indeclinable adjectives and display "indeclinable" in the headword? Vininn126 (talk) 21:03, 26 February 2024 (UTC)
- @Vininn126 Done. Ultimately we should maybe think about unifying other West Slavic languages (and even South Slavic/East Slavic) but there are lots of additional issues that come up. For now we have a unified Sorbian headword module and a unified Ukrainian/Belarusian headword module but otherwise things are separate. Benwing2 (talk) 21:09, 26 February 2024 (UTC)
- @Benwing2 I definitely agree and I know other editors agree. This is a step towards that and it's definitely something I appreciate, as the main Lechitic editor. This simplifies my life significantly and I've already been enjoying the results of this. Vininn126 (talk) 21:12, 26 February 2024 (UTC)
- @Vininn126 I deployed the unified module for Old Polish and also implemented {{pl-verb|def=1}} for defective verbs. Benwing2 (talk) 00:08, 27 February 2024 (UTC)
Also, adjectives and adverbs now have a |sup= parameter to explicitly specify superlative(s). You can specify the value + to request superlative(s) that are derived from the comparative(s); this is the default. Benwing2 (talk) 00:14, 27 February 2024 (UTC)
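As a rough illustration of the + default, here is a Python sketch that derives superlatives from comparatives by prefixing naj-, as in Polish. The function name superlatives and its signature are hypothetical, and the actual Lua module may well behave differently per language (for Old Polish, for instance, the superlative is not generated by default).

```python
def superlatives(comparatives, sup=("+",)):
    """Resolve |sup= values; '+' means 'derive from each comparative'.

    Hypothetical sketch assuming the Polish pattern naj- + comparative.
    """
    result = []
    for value in sup:
        if value == "+":
            # Default: one derived superlative per comparative.
            result.extend("naj" + comp for comp in comparatives)
        else:
            # An explicitly given superlative is used as-is.
            result.append(value)
    return result


if __name__ == "__main__":
    print(superlatives(["lepszy"]))    # derived from the comparative
    print(superlatives(["bardziej"]))  # periphrastic marker
```

Mixing + with explicit values in the same call would list the derived forms alongside the explicit ones.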
- Deployed for Polabian. Benwing2 (talk) 01:24, 27 February 2024 (UTC)
- @Benwing2 Great! Nice work. Vininn126 (talk) 04:54, 27 February 2024 (UTC)
- Note that there are no Slovincian headword templates currently so nothing in the way of creating them using the new module, but since Slovincian has free accent, it might make sense to use 1= for specifying the head including the accent (this is not in the pagename). BTW Friedrich Lorentz seems to have gone really crazy with the diacritics; I'm sure with a bit of thought he could have reduced their number. Benwing2 (talk) 05:05, 27 February 2024 (UTC)
- @Benwing2 In WT:About Slovincian, I am working on an orthography introduced by Sobierajski. The final step is to check the diphthongs. @Sławobóg, how do you feel about having the first parameter for Slovincian be dedicated to the head? Vininn126 (talk) 05:08, 27 February 2024 (UTC)
- @Vininn126 BTW I missed part of your comment above about Old and Middle Polish. Note that by default, the new {{pl-adj}} and {{pl-adv}} don't add to 'LANG uncomparable adverbs/adjectives'; this only happens if you explicitly specify |1=-. Also the support is now there for specifying the superlative separately from the comparative and in Old Polish the superlative isn't generated by default from the comparative. If it would help, I can add a way of suppressing default superlative generation for Middle Polish. As for your comment "we might want a way to mark an unknown masculine gender for Middle Polish", I think you're referring to unknown or unattested animacy? Note that Module:gender and number currently has a spec ?! that displays as gender unattested and doesn't currently categorize (either in a request category or any other category). We could make a similar thing for unattested animacy, aspect, etc. Benwing2 (talk) 05:26, 27 February 2024 (UTC)
- @Benwing2 Both would be useful. Vininn126 (talk) 05:28, 27 February 2024 (UTC)
- Also I am making progress on the Polish declension module although it's slower than I would have liked; Polish declension seems even messier than Czech. For example, I looked into the form of the genitive plural for soft feminine nouns: either (a) -i ending only ("neutral"); (b) null ending only ("characterized"); (c) both. To make sense of the variation I ended up having to distinguish 8 different subtypes of soft feminine nouns:
- in [-jly]a; those in -ja are preceded by a vowel;
- in -ia where -i- just indicates palatalization after [cnsz];
- in -ia where -i- indicates palatalization of the preceding consonant + /j/ and the dative is in -i, specifically after a labial [bfmpvw];
- in -ia where -i- indicates palatalization of the preceding consonant + /j/ and the dative is in -ii; specifically after a labial [bfmpvw], a velar (k/g/ch), or [ln]; rarely in -cia (only Garcia, pron [Garcija]), rarely in -czia (only glediczia, welwiczia), rarely in -dżia (only feredżia, lodżia); none in -[drts]ia;
- in -ia where -i- indicates /j/ following a hard consonant, specifically after [drt]; rarely in -cia: only dacia (pron [daczja]), felicia (pron [felicja]), lancia (pron [lanczja]);
- in -ja where -j- indicates /j/ following a hard consonant, specifically after [csz];
- in -ua;
- in -ni.
- Do you know if there are references (even in Polish) that discuss this sort of thing in detail? The Czech IJP site https://prirucka.ujc.cas.cz/ has a ton of useful discussion about Czech, e.g. such things as which nom pl endings tend to occur with which sorts of endings in masculine nouns. This made my life a lot easier when writing the Czech declension module. Benwing2 (talk) 05:33, 27 February 2024 (UTC)
- @Benwing2 For something with that level of detail, I'm not sure, unfortunately. I'm not even sure if papers on that exist - they could, but I would have to do some digging. Vininn126 (talk) 05:38, 27 February 2024 (UTC)
- Most books I have on morphology only go into the basic stuff - the most detailed reviews of classes would unfortunately be SGJP and that ugly-ass website, but if they don't have further detail I'm not sure. Most dictionaries I have also don't go into such detail. Vininn126 (talk) 05:42, 27 February 2024 (UTC)
- OK, I haven't sorted through all the resources on those two sites yet, let me take a look in more detail. Benwing2 (talk) 05:47, 27 February 2024 (UTC)
- @Benwing2 I'll have to go through Old Polish nouns next and do the same as for Middle Polish. Perhaps in some cases it might make sense to "reconstruct" based on the child languages, but I'm not always a fan of this as gender can change over time. Vininn126 (talk) 06:21, 27 February 2024 (UTC)
- @Vininn126 I think it's ok, but do we need that with Sobierajski orthography? Sławobóg (talk) 15:26, 27 February 2024 (UTC)
- I'm not sure - I think Sobierajski also has tones. Vininn126 (talk) 15:28, 27 February 2024 (UTC)
- @Vininn126 Since Slovincian had free stress, then logically Sobierajski orthography should have a way of marking the stress; otherwise we'll have to add it ourselves as it's important. Then the only consideration is whether to include the stress mark in the pagename (which is possible but contrary to the way it's done for other Slavic — and Baltic — languages, although AFAIK we do include the stress mark in the pagename of Proto-Balto-Slavic and PIE terms). Benwing2 (talk) 20:04, 27 February 2024 (UTC)
- @Sławobóg Ping. Benwing2 (talk) 20:05, 27 February 2024 (UTC)
- @Benwing2 That's another good point - stress is another aspect, so having a parameter IMO makes sense. I suppose we can change it later if we decide not, but I don't think that's gonna happen. Vininn126 (talk) 20:07, 27 February 2024 (UTC)
- @Benwing2 @Vininn126 Sobierajski gives only the place of stress (bjalˈawy, biḱ GEN bˈika) and Lorentz gives 5 types of stress/tones (bjalãvï, bḯḱ GEN bï̂kă). Polish scholars reject Lorentz's tones, but the Moscow Accentological School (not sure about that) accepts them (compare *sǫditi). We can't really tell who is right here, so we probably should keep Lorentz's tones somewhere (and Lorentz's notation without link). Sławobóg (talk) 21:22, 27 February 2024 (UTC)
- @Sławobóg @Vininn126 If we use Sobierajski's spelling as the normative form, we can put the Lorentz spellings under ==Alternative forms== (or alternatively in the headword, using an additional param). Benwing2 (talk) 21:26, 27 February 2024 (UTC)
- @Benwing2 Having an alternative head could be interesting. One problem is Sobierajski's dictionary only went up to C (but it did have some derived terms), so spellings would be intuited based on the system. Vininn126 (talk) 21:38, 27 February 2024 (UTC)
- @Vininn126 Did Lorentz include any justification or discussion of the tones and other fine distinctions he made? Benwing2 (talk) 21:40, 27 February 2024 (UTC)
- @Benwing2 His intention was to be as precise as possible for various reasons, I think as a point of pride. His orthography is incredibly narrow, Sobierajski is much more broad and phonemic. Vininn126 (talk) 21:45, 27 February 2024 (UTC)
- @Benwing2 I'm currently working on reworking the Wikipedia article for Slovincian so we can have a better understanding of it. That means at the moment painstakingly translating Lorentz's work and then making sense of it, and then adding other existing sources. I'm about 1/4 through Lorentz's grammar. Once that's done we can look at Sobierajski's dictionary and switch over existing entries. Vininn126 (talk) 08:20, 28 February 2024 (UTC)
- @Vininn126 OK great, sounds good. Note that I did some more work yesterday evening on Polish feminine nouns and I think I've worked out exactly what needs to be done. Words that end in -ia preceded by a labial or n and take gen/dat in -ii (e.g. fobia, mania) will use an indicator <stemij> because in my view they have an underlying stem ending /ij/ that compresses to /j/ ~ /i/ (spelled <i>) in the nom sg. Those ending in another consonant + -ia won't need this because whether the gen/dat is -i or -ii can be autodetected. There will also be appropriate defaults for whether the gen pl is neutral, characterized/marked or both (I have worked these out but they are a bit complex and maybe can be simplified). You can override which gen pl form appears by using one or more of <+fneut>, <-fneut>, <+fchar> or <-fchar>, or by overriding the gen pl directly. Internally there will be separate slots maintained for the neutral and characterized gen pl, but they will probably get combined into a single gen_pl slot (with appropriate footnotes if both forms exist) before they are displayed. I am aiming to do feminine nouns then neuter then masculine, since the masculine is the most complex. Benwing2 (talk) 00:09, 29 February 2024 (UTC)
- @Benwing2 That would make sense, as those nouns were originally -ija/-yja in Old/Middle Polish (and still are in Masurian and alternatively in Kashubian), but underwent contraction in Polish. Vininn126 (talk) 08:51, 29 February 2024 (UTC)
- @Benwing2 Actually, checking for other historical developments would probably explain a lot. Vininn126 (talk) 11:56, 29 February 2024 (UTC)
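To make the gen pl override indicators described above more concrete, here is a small Python sketch of how per-noun defaults might be combined with <+fneut>, <-fneut>, <+fchar> and <-fchar> and then merged into a single gen_pl list. Everything about it is an assumption for illustration: the function gen_pl_forms, its arguments and the placeholder forms STEMi and STEM are hypothetical, and the planned declension module (in Lua) is not shown in this thread.

```python
import re

# Matches the four override indicators named in the discussion.
_INDICATOR = re.compile(r"<([+-])(fneut|fchar)>")


def gen_pl_forms(neutral, characterized, default_neut, default_char, spec=""):
    """Return the genitive plural forms to display.

    neutral / characterized: the two candidate forms (kept in separate slots).
    default_neut / default_char: defaults chosen for the noun's subtype.
    spec: a string of indicators such as "<-fneut><+fchar>".
    """
    use = {"fneut": default_neut, "fchar": default_char}
    for sign, name in _INDICATOR.findall(spec):
        # "+" forces the form on, "-" forces it off.
        use[name] = sign == "+"
    combined = []
    if use["fneut"]:
        combined.append(neutral)
    if use["fchar"]:
        combined.append(characterized)
    return combined


if __name__ == "__main__":
    # STEMi / STEM are placeholders, not real Polish forms.
    print(gen_pl_forms("STEMi", "STEM", True, False))
    print(gen_pl_forms("STEMi", "STEM", True, False, "<+fchar>"))
    print(gen_pl_forms("STEMi", "STEM", True, False, "<-fneut><+fchar>"))
```

Keeping the two forms in separate slots until the end is what would allow a footnote to be attached only when both survive.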
Polish dialects
(Notifying KamiruPL, BigDom, Hythonia, Tashi, Sławobóg, Silmethule): and also @Benwing2 and @Skerillion I propose we set Polish dialects to be LDLs and mention that on this About page and also WT:WDL, much like how Middle Polish is. I will be making many templates for dialectal dictionaries and then I will organize them at Category:Polish reference templates. I also propose we modify Module:labels/data/lang/pl to match w:Dialects of Polish, i.e. we have Krajna, which when added as a label, adds the page to Krajna Polish and also Greater Polish (as a dialect group). I also want to modify Module:zlw-lch-IPA to be able to handle regional pronunciations the same way that Middle Polish is handled, i.e. no rhymes/syllabification. I don't see any issues if we want to allow audio for these, however remote that chance is. Similarly, we could modify the future Template:pl-pr to have {{{nostandard}}} so that we don't have 1000x templates. Some dialects should also get the standard pronunciation as well, however. I think most Urban dialects, for instance. Also in the near future I want to make something like Template:Polish regional forms or something like that that will go in the Alternative forms section, which will display a map with dialect boundaries. You will be able to type, for example, |Kociewie=FOO and that form will display over the Kociewie dialect region, etc.
All this is to say that I'm still not sure how to best handle Goral, or at this point Masurian. These will require a future discussion at WT:RFM, but I also know that Skerillion wants to further expand my, admittedly, brief overview of Goral dialects on Wikipedia. My hope was to just try and create a general overall system that we can expand upon.
Does anyone have any objections to the proposals I've outlined above? Any foreseeable problems? Vininn126 (talk) 11:53, 20 July 2024 (UTC)
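The label behaviour proposed above (a Krajna label adding the page to both Krajna Polish and the Greater Polish dialect group) can be sketched in Python as a lookup that walks up to the parent group. This is only an illustration under assumptions: the table LABELS, its keys and the function categories_for are hypothetical, and the real data lives in Module:labels/data/lang/pl in a different (Lua) format.

```python
# Hypothetical label table: each label has a category and an optional parent label.
LABELS = {
    "Greater Poland": {"category": "Greater Polish", "parent": None},
    "Krajna": {"category": "Krajna Polish", "parent": "Greater Poland"},
}


def categories_for(label):
    """Return the label's own category followed by its dialect-group categories."""
    categories = []
    seen = set()
    while label is not None and label not in seen:
        seen.add(label)  # guard against accidental cycles in the table
        entry = LABELS[label]
        categories.append(entry["category"])
        label = entry["parent"]
    return categories


if __name__ == "__main__":
    print(categories_for("Krajna"))
```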
- Also @PUC who regularly edits Polish. Vininn126 (talk) 11:54, 20 July 2024 (UTC)
- I have another issue that I forgot to mention: there's also the problem of normalization. Given the fact that dialectal sound changes are sporadic, and for some dialects some changes are stronger, but for others it can vary by speaker, how should we handle normalization? I think that spelling phonetically using a few additional letters, like I gave in the Wikipedia article, might be best. Vininn126 (talk) 12:15, 20 July 2024 (UTC)
- (Also we might want to make Wileń a separate subdialect of Northern Greater Polish. It's quite unique to the area.) Vininn126 (talk) 14:34, 20 July 2024 (UTC)
- @Vininn126 No objections from me. Benwing2 (talk) 18:58, 20 July 2024 (UTC)
- Okay, I have updated Module:labels/data/lang/pl. I hope I didn't fuck anything up (@Benwing2 could you take a look at my recent changes to it please?). I also have many reference templates for regional dictionaries (with more to come). Vininn126 (talk) 22:25, 20 July 2024 (UTC)
- Okay (Notifying KamiruPL, BigDom, Hythonia, Tashi, Sławobóg, Silmethule, Rakso43243, Skerillion): and @Benwing2:
- I've updated WT:About Polish and WT:WDL to say that dialects should be considered LDLs (except urban dialects, which seem to be more like Standard Polish).
- I've made a bunch of templates and organized Category:Polish reference templates.
- I've updated Module:labels/data/lang/pl.
- The next steps are going to be dealing with pronunciation, declension, and the maps.
- For pronunciation, I think I'll be able to update the module with the appropriate information? But also we need 1) the ability to turn off Standard Pronunciation (and in doing so, turn off rhymes and syllabification), 2) organizing of dialects by dialect group, and 3) auto-collapsing when there are too many on a page (as in theory you might have dialectal pronunciations alongside standard, etc.).
- Declensions are going to take some time and work. But generally we shouldn't be using Standard Polish declensions for dialects, I feel.
- The maps are also going to take some time, and are my second priority alongside pronunciation.
- Please let me know if you have any questions, comments, concerns, or if you disagree, etc. Vininn126 (talk) 17:48, 21 July 2024 (UTC)
- Why do we need the ability to turn off the Standard Pronunciation? I believe standard pronunciation should be given in every entry possible unless it's not attested or was attested in MP only. Tashi (talk) 18:47, 21 July 2024 (UTC)
- Because you don't have "slanted vowels" like you do in Masovian dialects, and some words exist only in, say, Kurpie. Vininn126 (talk) 19:06, 21 July 2024 (UTC)
- A good example would be przetápsiać, where you can see that it has slanted a. Furthermore it's a regional alt form of przetapiać. Vininn126 (talk) 19:10, 21 July 2024 (UTC)
- I think you misunderstood. The presence of a dialectal form does not automatically turn off standard pronunciation, we just need to be able to turn it off if the term is only dialectal. Vininn126 (talk) 19:17, 21 July 2024 (UTC)