Wiktionary:Beer parlour/2018/April
Image captions and descriptions of the representing objects such as paintings
An editor (@Sgconlaw) likes to place into image captions descriptions of representing objects such as paintings, engravings and the like. For example, he wrote "An engraving from a 16th-century treatise by Levinus Hulsius" in an image caption at triangulation before I removed it. In that entry, we now have a to-and-fro.
I oppose this practice. I oppose having a description of the representing object (painting, engraving) even in the note created via <ref>. Such a description can be found on Commons. Wiktionary is a dictionary and its images and captions should help readers learn about the referents, or in some cases about character strokes. Wiktionary should not contain marginally relevant tidbits only because they could be interesting. There is no tight relation between the referent and the representing object: anything can be on a painting, and the painting can be in any gallery in the world. It is random noise.
Your thoughts?
--Dan Polansky (talk) 08:51, 1 April 2018 (UTC)
- This is not an April joke on my part. --Dan Polansky (talk) 08:51, 1 April 2018 (UTC)
- I oppose Dan's unnecessarily rigid approach. The information is useful to the reader, and placing it in a footnote is an acceptable compromise between having it in the caption itself and not having it at all. — SGconlaw (talk) 09:28, 1 April 2018 (UTC)
- "Useful to the reader" is not enough; "pertains to a dictionary" or "befits a dictionary" is required. For instance, the possible cures of pneumonia could be useful to the reader, but have no place in a dictionary. And strictly speaking, these tidbits are not useful in the sense in which a knife or a fridge is useful (and definitions, if you are a translator); they just fuel idle curiosity. --Dan Polansky (talk) 09:44, 1 April 2018 (UTC)
- Wiktionary is not a print dictionary; space is not an issue. We are also not talking about reams of information that something like "cures of pneumonia" might entail. I maintain that providing some source information that places an image in context is useful to the reader. And what is wrong with "idle curiosity"? One of the beauties of the Wikipedia project is the serendipity of discovering something else interesting while you are looking for one thing. In any case, I remain guided by any clearly established consensus on the issue. — SGconlaw (talk) 10:06, 1 April 2018 (UTC)
FWIW, I once again agree with Dan Polansky. --Per utramque cavernam (talk) 11:56, 1 April 2018 (UTC)
- I don't agree: "engraving from a 16th-century treatise" is valuable contextual information. This information is conveniently in one place, I don't have to click through to Commons, wait for the page to load and then scroll through the page to find that information. Also think of a situation where the page is read in an offline reader (e.g. Kiwix), without access to Commons. – Jberkel 12:02, 1 April 2018 (UTC)
- To me, the main point is that knowing it's an "engraving from a 16th-century treatise" is lexicographically irrelevant. I just want our entries to "get to the point", which is offering lexicographical information. I want as much of that as possible, and in that sense I do agree that "space is not an issue"; but I don't want anything else.
- And frankly, I think our attention is already solicited enough by tens of thousands of different things that we don't need more of that; I find it actually refreshing to have a single-minded environment, focused on one thing only: words. I don't want more occasions for idle curiosity and serendipity. --Per utramque cavernam (talk) 12:22, 1 April 2018 (UTC)
- As for serendipity, all these definitions alone provide for it: you may have wanted one sense of triangulation, but get a multitude of definitions instead; and then you have derived terms and related terms to explore further if you are in an explorative mood. Or click on a category to get more items. All lexicographical. --Dan Polansky (talk) 13:03, 1 April 2018 (UTC)
- I agree with Dan. I think the information is best included under "References," not in the caption, where the sole purpose of the picture is to illustrate a definition. I don't care who took the photo, or painted the picture, or carved the statue, I just care about the lexicographical information. Andrew Sheedy (talk) 19:32, 1 April 2018 (UTC)
- Me too. The caption should be as simple as possible and lexicographically focused. The caption currently on that entry ("People determining the width of a river by triangulation (sense 1)") is ideal, although I would like to get rid of the non-lexicographic information altogether, not even placing it in a footnote. People can click through to the file description page if they want to know more about the picture itself. This, that and the other (talk) 03:32, 2 April 2018 (UTC)
- I agree with SGconlaw and Jberkel. I don't see why images should be 100% about lexicographical information without adding any other information. The added information provides useful and interesting context and takes up so little space that it hardly distracts or obscures anything lexicographical. — Mnemosientje (t · c) 03:40, 2 April 2018 (UTC)
- Images should be 100% lexicographical because Wiktionary should be 100% lexicographical. From Wiktionary:Criteria_for_inclusion#Wiktionary_is_not_an_encyclopedia: "Care should be taken so that entries do not become encyclopedic in nature; if this happens, such content should be moved to Wikipedia, but the dictionary entry itself should be kept. ¶ Wiktionary articles are about words, not about people or places. Articles about the specific places and people belong in Wikipedia."
- In this revision, in my browser, the caption is almost two times as tall as the image itself, and it forces definitions to wrap to a next line before the end of the page. --Dan Polansky (talk) 10:13, 2 April 2018 (UTC)
- I don't think the WT:CFI is very apt in this context. It is clearly talking about Wiktionary entries that essentially become Wikipedia articles. The issue at hand is whether it is appropriate to provide some sourcing information for images used in entries in a "References" section. Plus, browser settings vary from user to user; on my browser the same caption is nowhere near even half the height of the image. I also do not see why definitions wrapping to another line is an issue. The text remains entirely readable, and we have other forms of content such as example boxes that also cause line wrapping. — SGconlaw (talk) 10:56, 2 April 2018 (UTC)
- Definition line wrapping is acceptable when caused by lexicographical content; it is annoying when caused by non-lexicographical random tidbits added to make the entry more artificially "interesting", to people who do not find lexicographical information interesting enough. The CFI passage may not have been intended for image captions but rather for definitions, but that does not change the impact and significance of its wording, which is: let Wiktionary show lexicographical content, and none other. --Dan Polansky (talk) 11:25, 2 April 2018 (UTC)
- I don't see why longer image captions should be considered a problem, I have done this myself on occasion. In fact captions can be used to supplement the definition, and I don't see why DP is picking on Sgconlaw in particular. It seems to be DP's pet hate. DonnanZ (talk) 12:01, 2 April 2018 (UTC)
- I've called Sgconlaw into the discussion via a ping since almost all instances of the problem that I have seen were from him, and I found it only fair for him to join the discussion. Captions should not supplement definitions; if definitions are incomplete, they should be expanded. Moving encyclopedic content from definitions to captions is still in violation of WT:CFI as formulated. Moreover, encyclopedic content about the referent is still more worthwhile than telling us the author of a painting. --Dan Polansky (talk) 12:07, 2 April 2018 (UTC)
- I don't care for the extras, which remind me of what you find on a museum exhibit. I would rather not have them, but if we are going to include them, make them into alt text that shows only when you hover over the image. Does the standard image markup have a parameter for this, or do we need to have a template that provides the option? I know it can be done in html, but that would clutter up the wikitext and make it less accessible to those who don't know html. Chuck Entz (talk) 14:18, 2 April 2018 (UTC)
- Wouldn’t putting such information in a footnote that appears in the “References” section be a reasonable compromise? — SGconlaw (talk) 14:34, 2 April 2018 (UTC)
- Re: "space is not an issue". Screen space often is an issue. (Download time can be an issue, though long captions don't have a material effect.) If the problem is screen space we could resort to show/hide bars to have it both ways: Lexicographical content in the bar, non-lexicographical content hidden by default.
- Definitions are not the sole kind of lexicographical content which an image (or sound, etc) can support: I use images that provide some support for the semantic etymology of a term. The first image at Godiva will serve to illustrate why Godiva quadricolor has the specific epithet it does and, less clearly, why the genus name. DCDuring (talk) 15:07, 2 April 2018 (UTC)
- @Sgconlaw, I would be happy with that compromise. Andrew Sheedy (talk) 15:59, 2 April 2018 (UTC)
- I was thinking the other day the ability to show or hide captions would be a good idea, with "hide" as the default setting. Would that please everybody, DP even? DonnanZ (talk) 17:32, 2 April 2018 (UTC)
- It might just be a little confusing where there are several pics in a row for different senses of a word. Equinox ◑ 17:37, 2 April 2018 (UTC)
- @Equinox: Would it be more confusing than having the encyclopedic caption? or just more confusing than a caption with only lexicographic information? DCDuring (talk) 18:59, 2 April 2018 (UTC)
- Here I'm talking about any caption versus none at all, not about the specific content. It's a matter of distinguishing senses. Equinox ◑ 22:23, 2 April 2018 (UTC)
- They are not references, and do not belong in the "References" section.
- You could make them part of a "Notes" section, using <ref group="note">BLAH</ref> / <references group="note"/>. —Suzukaze-c◆◆ 03:11, 26 April 2018 (UTC)
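The "Notes" suggestion above can be sketched in wikitext roughly as follows. This is only an illustration, assuming the standard MediaWiki Cite extension; the file name is a placeholder, and the "Notes" header name and level are assumptions rather than anything agreed in this discussion.

```wikitext
<!-- Caption stays lexicographic; the image-source detail goes into a grouped note. -->
<!-- "Example.jpg" is a placeholder file name, not the image actually used at triangulation. -->
[[File:Example.jpg|thumb|People determining the width of a river by triangulation (sense 1)<ref group="note">An engraving from a 16th-century treatise by Levinus Hulsius.</ref>]]

<!-- Hypothetical section: lists only the notes in group "note". -->
===Notes===
<references group="note"/>
```

Because the note carries a group name, it would be listed separately from any ungrouped <references/> under "References", which addresses the objection that these descriptions are not references.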
- A major benefit of hypertext over earlier forms of text is the ability to follow hyperlinks to learn more about something. These are optional side routes. I might see a pic and think "I wonder who took that photograph for Mediawiki", or "I wonder which year that oil painting was done in", but those are diversions; they are what clicking and linking are for. We should not put that lexicographically irrelevant info directly into the entry. Equinox ◑ 22:24, 2 April 2018 (UTC)
- The primary purpose of images in dictionaries is to illustrate senses. It might be interesting to know more about the image besides what is visible, but I believe such information is noncrucial, belongs elsewhere, and only ends up becoming extra clutter on our entries. I agree with Equinox. —Suzukaze-c◆◆ 03:18, 26 April 2018 (UTC)
Why are header levels the way they are?
Why are we not using L1 headers at all and why are etymology/pronunciation/POS all L3 instead of POS and one of the others being nested within the third? So why isn't L1 language, L2 pronunciation and etymology/POS L3 or even POS nested as L4 beneath etymology? Korn [kʰũːɘ̃n] (talk) 11:32, 2 April 2018 (UTC)
- If I understand this correctly an L1 header like =Norwegian Nynorsk= would be far too big, I tried it. ==Norwegian Nynorsk== is more acceptable. DonnanZ (talk) 11:46, 2 April 2018 (UTC)
- A more general question about headers: Why are they always sized by level? I can understand why the sizing makes some sense from the top of an L2 section. But, IMO, headings appearing after the main lexical content like "References", "Further reading", "Anagrams", etc. don't merit L3 heading size. In addition, "Alternative forms", "Pronunciation", and, to a lesser extent, "Etymology" don't merit the font size we use. Couldn't the structuring function sometimes served by "Etymology" (and less often "Pronunciation") be supported by some means other than heading size? DCDuring (talk) 14:50, 2 April 2018 (UTC)
- Here as at Wikipedia we don't use L1 headers because the page name itself is already the L1 header. —Mahāgaja (formerly Angr) · talk 17:07, 2 April 2018 (UTC)
- Page titles are already L1 headers. Many pages don't even have etymologies or pronunciations which would mean we are putting everything under a blank "Etymology" header- consider the millions of inflected entries. Furthermore you should think about how multiple etymologies will interact with multiple pronunciations. DTLHS (talk) 18:20, 2 April 2018 (UTC)
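For readers unfamiliar with the layout being questioned, the header nesting described in this thread looks roughly like the sketch below. It is a simplified illustration, not a quotation of any real entry: the language names and part-of-speech headers are placeholders, and the second half shows the Etymology-first arrangement that applies when a spelling has several etymologies.

```wikitext
<!-- Single etymology: the page title acts as L1, the language is L2, the rest is L3. -->
==English==
===Etymology===
===Pronunciation===
===Noun===

<!-- Multiple etymologies: Etymology N stays at L3, everything tied to it drops to L4. -->
==Latin==
===Etymology 1===
====Pronunciation====
====Noun====
===Etymology 2===
====Pronunciation====
====Noun====
```

The second arrangement is what produces the repeated Pronunciation sections that come up later in the thread.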
- Ah, page titles. I would assume multiple etymologies will interact with multiple pronunciations the same way as now: One of them comes on top, then the rest gets sorted in below it, then the next one comes on top... Korn [kʰũːɘ̃n] (talk) 09:31, 3 April 2018 (UTC)
- The interaction of Etymology and Pronunciation headers was a long-standing issue between User:EncycloPetey ("EP") and the late Robert Ullmann. EP's concern was, for the Latin entries he was interested in, sometimes it made more sense (for him, at least) for PoSes to be organized first by pronunciation, then by etymology. As a result Latin entries were excluded from the operation of one of Ullmann's bots that attempted to ensure that ELE header rules were followed. EP never came up with a counterproposal to the ELE approach that gives Etymology priority. DCDuring (talk) 12:41, 3 April 2018 (UTC)
- Because we could never come up with an alternative to Etymology-first structure, see an entry like palma#Latin for a simply-complicated page where the interaction of Etymology and Pronunciation creates issues. Two etymologies, each of which have two pronunciations, where the pronunciations are tied to specific inflected forms. This same situation of two different pronunciations of the same spelling, tied to the same etymology, but applied only to specific inflected forms occurs in nearly every regular Latin verb, as well as the ablative endings of nouns (and adjectives) as evinced by palma. So, because Etymology has priority, we have to use two different Pronunciation sections under each Etymology section. --EncycloPetey (talk) 14:38, 3 April 2018 (UTC)
- In cases like that I prefer to list all of the pronunciations under a single Pronunciation header, labeled appropriately, e.g. at briseadh#Irish. —Mahāgaja (formerly Angr) · talk 15:23, 3 April 2018 (UTC)
- Etymology doesn't get priority when a certain bot places Alternative forms above it, which is why I am beginning to treat Alternative forms as L4. That way the bot leaves them alone. DonnanZ (talk) 23:29, 3 April 2018 (UTC)
- I always held the view that pronunciation should by default be the first header by which things are sorted after spelling unless there is strong reason to do otherwise, which can be argued for briseadh, effectively making it a case-by-case issue. Of course this is a problem for consistency, and while as an editor I understand why it is done the way it is, as a user I think that the pronunciation section of briseadh is a monstrosity. Why are the verbal pronunciations put above the noun and not above the verb? Korn [kʰũːɘ̃n] (talk) 00:10, 4 April 2018 (UTC)
- @Korn: The way I see it, the Pronunciation section of briseadh applies to the entire Irish entry and not just to the POS following it. Would you prefer it if it looked like this? I can see a case could be made for it, but it also seems a bit like overkill. —Mahāgaja (formerly Angr) · talk 12:08, 4 April 2018 (UTC)
- The screen-filling pronunciation section at [[briseadh]] would seem a perfect use of a show-hide bar for the entire section. DCDuring (talk) 12:15, 4 April 2018 (UTC)
- Wow, I thought I'd consider this overkill too, but now that I see it, yes, yes I would prefer if it looked like this. A lot less scrolling around in the page at the cost of some redundancy. Definitely the user experience I prefer. Korn [kʰũːɘ̃n] (talk) 12:16, 4 April 2018 (UTC)
This L2 header appears in ~174 entries, apparently primarily added by User:Mar vin kaiser. Perhaps the language should also have an entry? - Amgine/ t·e 04:39, 3 April 2018 (UTC)
- Yes, it should. It has no entry in the English Wikipedia, but it does have one in Ethnologue. I'll have a go. SemperBlotto (talk) 05:37, 3 April 2018 (UTC)
Requesting rollback
Hi. I am trusted here and I frequently look at Special:RecentChanges and undo vandalism. Therefore, I would like to request the rollback right. Inner Focus (talk) 08:58, 3 April 2018 (UTC)
- You have like 190 edits here and you're blocked as a sock on enwiki. —AryamanA (मुझसे बात करें • योगदान) 10:39, 3 April 2018 (UTC)
News from French Wiktionary
Hello!
The March issue of Wiktionary Actualités just came out in English!
An incredible issue of Actualités has just landed on Wiktionary with two articles about words, a few words about Wiki Indaba, a tremendous dictionary that will change the world, and stats and news as usual.
This issue was written by six people and was translated for you by Pamputt. This translation may be improved by readers (wiki-spirit). We still receive zero money for this publication and your comments are welcome. You can also register to be notified on your talk page. Noé 17:04, 3 April 2018 (UTC)
Should anons' editing rights be restricted?
There are at least two topics in Wiktionary that are especially vulnerable to speculation: 1) reconstructed entries and 2) etymologies. I have noted that there are anon users who show perhaps too much interest in these, adding fantastic theories about whatever, often based only on phonetic similitude between two words. This has led me to think that reconstructed entries and, if technically feasible, also the etymology sections of mainspace articles should be the domain of registered users only. It takes the time of several wise men to check the work of one fool, who can use a changing array of IP addresses. I'd like to invite discussion about this topic. --Hekaheka (talk) 12:24, 4 April 2018 (UTC)
- There are ways to protect reconstructed entries, but there is no way to protect only sections of pages (without a significant overhaul in how we structure our pages anyhow). As to whether we should, I don't think so. Those who patrol pages should perhaps flag changes to etymologies in some way for further review. - TheDaveRoss 12:57, 4 April 2018 (UTC)
- As one of those people who patrol, etymology is a bright line I avoid touching due to the likelihood of on-wiki drama. I agree it should not be separated to a sub-page or namespace (or further abuse filter abuse,) but I also disagree it should be made off-limits. Poor anon contributions are often a symptom of future good contributions, and other negative personal outcomes. - Amgine/ t·e 19:40, 4 April 2018 (UTC)
- IMO it's important that people can do stuff without signing up. I remember the 1990s Internet where you could roam free and comment here, chat there, and never give anyone a name, or have to invent YET ANOTHER stupid password. Okay, now we have to deal with a massive influx of millions of stupid children, but requiring an account is close to having a paywall; and wikis are supposed to be open. Unless we're seeing 98% bad edits from IPs I think it's a very bad step to punish them proleptically. Equinox ◑ 02:30, 5 April 2018 (UTC)
- To address Heka's comment more directly: we do have the "unpatrolled" flag on entries until someone looks at them. If anything, the problem might be that we don't have enough patrollers (whether actual admins don't bother, or we don't speak the right languages, or we don't have enough admin users). Equinox ◑ 02:31, 5 April 2018 (UTC)
- We have plenty of admins (most of them seem to find actually using the admin tools distasteful). DTLHS (talk) 02:36, 5 April 2018 (UTC)
- Perhaps this is a Finnish-only problem, then. We have a very limited supply of admins who are capable of patrolling etymologies (I'm not admin, nor knowledgeable enough on history of words), not to even mention the reconstruction pages. To put it straight, I'm afraid that a considerable portion of our pro-fin reconstructions may be bullshit. --Hekaheka (talk) 12:43, 5 April 2018 (UTC)
- I share some of Heka's concerns, especially taking into account various sockpuppets of formerly blocked users enjoying the hide-and-seek game and challenging our rules hospitable to anons. When they get blocked for bypassing the blocks, they cry about censorship. I don't know what tools admins may have when they are outnumbered by careless or even hostile editors but I would suggest we need something to bulk-undo edits of editors identified as unreliable, especially if they have been warned several times or are known to be sockpuppets of formerly blocked users. We already have Special:Nuke; we need something to mass undo edits by user name/IP. --Anatoli T. (обсудить/вклад) 13:31, 5 April 2018 (UTC)
- But since I come with a different IP each time and most often you don’t detect my edits (and when you do I afterwards re-edit them to your despair): tough luck. — This unsigned comment was added by 94.124.194.167 (talk).
- [The comment above was left by one of User:Gfarnab's numerous sock-puppets, a serial multiple-account abuser under the impression he is winning] --Anatoli T. (обсудить/вклад) 12:54, 13 April 2018 (UTC)
- While I am surprised that Heka is not an admin, they are a patroller and rollbacker, so they have the relevant tools to help. It might be the case that abuse filters could be made to flag entries specific to Finnish etymologies for further review. - TheDaveRoss 14:54, 5 April 2018 (UTC)
- There has definitely been a recent influx of anons adding speculative or just plain wrong etymological info to mostly Finnish and Proto-Finnic entries, building up faster than I at least can possibly keep up with, most of which also smells like the work of one dedicated person (recent examples include 213.216.248.45, 109.240.25.153, 188.238.130.124, 188.238.169.74). I can think of a few remedies:
- indeed ban anon editing of reconstructions (but this seems unlikely to have a major effect, since mainspace etymologies would remain for editing);
- make sure our mainspace etymologies are sourced, and keep a close eye on (auto-flag?) any edits that remove sources or add unsourced information;
- add something like Appendix:False cognates where anonymous observations on etymology are welcome (the four anons above have been adding "X is false cognate with Y" rather liberally around, even though this is usually irrelevant for the actual etymology)
- But I fear that eventually we may have to abandon maintaining reliable etymologies on Wiktionary altogether, for smaller languages with fewer dedicated editors at least. Etymology is a much more academic discipline than general lexicography, that requires more background knowledge and caution. As our coverage grows, more and more knowledgeable editors are needed to keep track of all the etymological information we have already, and to prevent decay over time. Unlike casual drive-by vandals, amateur etymologists are often also quite dedicated to pushing their views.
- — FWIW I am currently working on a project to establish an online repository (a closed wiki, in fact) of proper academic research on Finnish and general Uralic etymology, so I expect that the amount of time and energy that I can spend on patrolling Wiktionary's etymology coverage is not going to be increasing in the future. --Tropylium (talk) 15:11, 5 April 2018 (UTC)
- Personally, I don't like the idea of banning anons from editing etymologies, since I probably edited hundreds of etymologies in the months before I finally decided to make an account. It might be useful to have a way of keeping track of anon edits specifically to etymology and reconstructed templates, like on a page where editors can check them off once they've reviewed them. I have no idea how common these edits are, though, so something like that may be an unreasonably labor-intensive task. —Globins 03:29, 14 April 2018 (UTC)
April LexiSession: mining
This month is mine! Not mine as if one possessed it, but mine as in mining, exploitation of minerals. The reason behind this theme is that in the French revolutionary calendar, April was renamed Germinal, and it is also the title of a book by Emile Zola about miners. So, mines!
By the way, LexiSession in short: a collaborative transwiktionary experiment. You're invited to participate however you like and to suggest next month's topic. The idea is to look at other communities' improvements on the same topic to improve our own pages and learn foreign ways of contributing. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession (like Lingo Bingo Dingo did last month, thanks to him!). If you can spread the word to other Wiktionaries, you are welcome to do so. Ideally, LexiSession should be a booster for every Wiktionary on the same page, but it depends on the people, and I am still a volunteer in this project, so with limited time to disseminate the message. Noé 10:14, 5 April 2018 (UTC)
Play a game again
I shall become Gamesmaster again. We can have another round of a version of a multilingual word game played on a 15 by 15 board. We're gonna start Wiktionary:Random Competition 2018 next week. --Cien pies 6 (talk) 11:30, 11 April 2018 (UTC)
- This coming Monday will be when the first entries are placed on the board. --Cien pies 6 (talk) 11:18, 13 April 2018 (UTC)
Documentation page about interwikilinks
Hello,
For your information, I updated the documentation page about interwikilinks for Wiktionary. Feel free to reuse its content for your own documentation, and let me know if I forgot something important.
Cheers, Lea Lacroix (WMDE) (talk) 13:54, 12 April 2018 (UTC)
A live stream about contributing to French Wiktionary in a few hours: https://www.youtube.com/c/LyokoïKun/live
Hello all! I am trying a new format on my YouTube channel: a live stream where I contribute to French Wiktionary. The stream is in French, but I can respond in English. Maybe you would like to do the same for the English Wiktionary! :D --Lyokoï (talk) 15:57, 12 April 2018 (UTC)
- That looks fun. I could certainly go for that one day - I'd be wearing a mask and red underpants and with a voice distorter of course to hide my identity. --Cien pies 6 (talk) 11:14, 13 April 2018 (UTC)
Removing text on learned borrowing and doublet
I think we should remove the text on Template:learned borrowing and Template:doublet so that they're consistent with Template:borrowing. I end up writing notext=1 to get rid of the text almost every time I use these, and it would be more convenient to just have no text as the default and be able to add the text when it's necessary. —Globins 03:15, 14 April 2018 (UTC)
- Support removing the lead text of {{learned borrowing}} (and that of {{calque}} and {{semantic loan}}), abstain for {{doublet}}. --Per utramque cavernam (talk) 11:58, 15 April 2018 (UTC)
- I disagree / tend to oppose this. How are you using {{doublet}} that you're having to suppress the text? Are you spelling out the text "doublet of" (in which case, let the template do it), or adding doublets to lists of cognates (which seems undesirable)? Likewise, with {{calque}} and {{semantic loan}}, it seems desirable to spell out that it's a calque (etc), whereas with mere "borrowing" vs "inheritance" one doesn't need the text IMO because the result is the same (word foo was adopted, and potentially adapted as e.g. fu) and it's almost always obvious whether what happened was borrowing or descent based on whether the receiving language is descended from the giving language. In practice, "learned borrowing" seems to be poorly distinguished (especially in English entries) from mere borrowing (think of Latinate words variously labelled "borrowings" or "learned borrowings"); I'm on the fence about whether its text should be removed. - -sche (discuss) 16:31, 15 April 2018 (UTC)
- @-sche, I agree that adding doublets to lists of cognates seems undesirable, but based on the pages I've edited it seems to be a common practice. Often, doublets are somewhere within a list of cognates and use {{cog}} instead of {{doublet}}. Actually, now that I'm writing this out, I realize that it probably makes more sense to use another sentence for doublets and keep the text, and that I got the idea that the text should be suppressed because I made some edit when I was new to Wiktionary where I added a new sentence for a doublet and it was reverted. I oppose what I said earlier about doublets, as well as removing the text from {{calque}} and {{semantic loan}}. These are uncommon enough that I think it's preferable to specifically mention them with text when they occur. For learned borrowings, I think the lack of distinction between them and normal borrowings gives more validity to the argument that the text should be removed, since the text could potentially mislabel a term when we could just avoid saying something wrong by writing "from." —Globins 21:30, 15 April 2018 (UTC)
- To be clear, I agree that "calque" and "semantic loan" should always be spelled out explicitly. I just don't think we need this pesky lead text to do it. --Per utramque cavernam (talk) 13:24, 16 April 2018 (UTC)
- I doubt that editors will reliably spell the text out themselves, though, if the template doesn't provide it. I also don't see why we should make them, when the template can do it. - -sche (discuss) 20:01, 19 April 2018 (UTC)
Use of Template:cognate
Is this template only intended for cognates, or should it be used when you just want to link to the Wikipedia page for a language? —Globins 04:11, 16 April 2018 (UTC)
- Technically there is no difference in output (IIRC), but if you aren't comfortable with using a template called {{cog}}, you can use {{noncog}}. —Suzukaze-c◆◆ 04:12, 16 April 2018 (UTC)
- I always use {{cog}}. --WikiTiki89 13:18, 16 April 2018 (UTC)
- As do I. Introducing a difference between the two would probably require manual cleanup, then.__Gamren (talk) 17:30, 19 April 2018 (UTC)
- At the moment, it is only a difference in template name. At Module:etymology/templates, noncognate is a redirect to cognate. —Suzukaze-c◆◆ 18:53, 19 April 2018 (UTC)
- Guys, don't forget that we also have {{m+}}. --Per utramque cavernam (talk) 18:54, 19 April 2018 (UTC)
- Any template will be used by editors anywhere, regardless of its intended purpose, if it produces their desired formatting result. This is the case no matter how good the documentation is or how explicit it is about where the template should be used. I regularly clean up uses of {{t}} outside of translation sections. The only solution is either some software change (to display a prominent error message if the template is used in the wrong context, not currently possible), or vigilance to manually clean up other editors' carelessness. DTLHS (talk) 19:03, 19 April 2018 (UTC)
- However, in relation to {{cog}}, it was originally meant to be a general template that combined {{etyl|lang|-}} with {{m|lang|word}}. It's only named "cognate" because that's its most common use. The only reason {{noncog}} exists is because some stubborn editors couldn't bring themselves to use a template named "cognate" for something that wasn't a cognate. --WikiTiki89 19:54, 19 April 2018 (UTC)
Typo Team project 'Moss' has been updated
w:Wikipedia:Typo Team/moss, a project first introduced to us in 2015 which finds words used on Wikipedia that do not have Wiktionary entries, has been updated. The words in this list are generally either valid words that we're missing and could add, or (more often) typos in the Wikipedia articles which could be fixed. - -sche (discuss) 17:11, 16 April 2018 (UTC)
Disallowing Appendix-only constructed languages
As a follow-up to this. Per that vote, all Lojban entries were moved to Appendix on the grounds that most of them would likely not satisfy our criteria for inclusion and thus be deleted. The situations of the other languages in the category are presumably similar. But Appendix is still a "dictionary namespace", i.e. it is a part of the site that is presented to our readers. As such, we are obliged to ensure the accuracy of the information contained therein, which can only meaningfully be done by the same means that we prove accuracy in the mainspace: attestation. Really, delegating this content to appendix only makes it somewhat more troublesome to access and less likely for someone to stumble upon it; it doesn't solve the problem that we are being used as a platform for distributing unverifiable information. I hope I will be excused for speculating that that course of action was favoured over entirely expunging the content out of a misguided desire to appease.
I am not opposed to allowing terms satisfying the WDL criteria to be included in mainspace, even if the number of such terms would be small; after all, such is also the situation of some natural extinct languages. It would probably be useless for users, but academically legitimate. I do, however, oppose constructed languages being allowed LDL status.__Gamren (talk) 17:27, 19 April 2018 (UTC)
- You claim that "we are obliged to ensure the accuracy of the information contained therein [in the Appendix], which can only meaningfully be done by the same means that we prove accuracy in the mainspace: attestation". This is false. See WT:LOP for a time-honoured appendix page explicitly designed to hold terms that cannot be attested. —Μετάknowledgediscuss/deeds 18:40, 19 April 2018 (UTC)
- Those pages inform the reader that the content is protologisms, which makes it more acceptable. However, the text also seems to invite editors to add their own invented words? That doesn't seem appropriate.__Gamren (talk) 18:51, 19 April 2018 (UTC)
- That is what we have done for many years. You are welcome to suggest changing our practices, but you should understand them first. —Μετάknowledgediscuss/deeds 19:46, 19 April 2018 (UTC)
- I think that for appendix-only constructed languages the burden is (in practice) "a use or mention"; sometimes we've seemed to take (or accept confirmation of) a few vocab words from even mere websites of official government bodies. If there were concern that a Lojban word was attested in only a single non-authoritative book and might be a nonce, it could be labelled as such; if a word were only "attested" in e.g. an official dictionary, the only concern I would have about including it would pertain to copyright; I don't think there would be a danger that "the authorities made up the word and it isn't in use", since that describes almost the entire language and is the reason it exists in appendix-space and not the mainspace. If the word is not attested anywhere at all, then obviously we shouldn't have an entry for it. - -sche (discuss) 19:59, 19 April 2018 (UTC)
- Right, and we simply shouldn't have made-up words that haven't been used, regardless of whether anyone thinks the person who made it up is an "authority".__Gamren (talk) 09:19, 20 April 2018 (UTC)
- The appendix space has never been reserved for strictly verifiable information, as Metaknowledge points out above.
- Even ignoring this, practically all information about most of the languages in question is verifiable. The attestation criteria applied to entries in the mainspace, which are a set of workable but somewhat arbitrary conventions, are not the only means by which information or usage can be meaningfully verified. Note that a word can easily be attested and yet not meet WDL CFI. Setting aside constructed languages, a number of other appendices exist in order to contain information that’s verifiable but doesn’t meet the standard WT:CFI criteria — compare Appendix:English dictionary-only terms — which suggests the established usage of the appendix namespace is not in line with the proposed restrictions on that namespace.
- Some (CFI-attestable) words in natural languages are etymologically derived from words in appendix-only constructed languages (e.g. silflay). Deleting the etyma here would be strange, as the word didn’t pop into existence in English, but was derived from a word in another language.
- As far as I’ve seen, the arguments given so far against treating constructed languages with communities of speakers as LDLs have solely been on the grounds that they are invented. I frankly don’t see why that fact should enter into the question at all (assuming they are genuinely used as a means of communication). They seem like a perfect use case for the LDL label: lexicographical information that is straightforwardly verifiable by its attestation in texts, which, for reasons of poor documentation in the types of sources we label ‘durably archived’, doesn’t meet WDL attestation criteria. Words can be coined just as readily in little-documented natural languages as in artificial ones, so this wouldn’t ‘open the floodgates’ for protologisms any more than our current policies do.
- I do think coverage of constructed languages should be limited to those actually used by multiple people as a means of communication; certainly, we shouldn’t cover personal artlangs that only one person will ever use.
- Considerations of what is useful to users of the site are important. This suggestion would make the site less useful to people looking for information on the languages in question. The benefits gained in doing so seem minimal. — Vorziblix (talk · contribs) 12:27, 21 April 2018 (UTC)
- But if we don't care about accuracy at all (and I flat out refuse to accept "it was made up by this or that person" as proof of accuracy; usage matters, origin does not), why not just keep it in mainspace (this is aimed at those who voted support)? Surely that's more "useful".__Gamren (talk) 10:49, 22 April 2018 (UTC)
- The vast majority of the words in question are attested in usage in addition to being mentioned in the source material. For Lojban in particular, which now makes up most of the material being questioned, there’s a corpus of some 7 million tokens in actual usage here. — Vorziblix (talk · contribs) 17:07, 22 April 2018 (UTC)
- Seven million tokens. So, seven to ten times the size of the Bible, 14 times War and Peace, or 140 times a NaNoWriMo novel. And that includes everything, even IRC. Assuming the average WP page is at least 500 words, we could get that for 150 languages just by dumping WP (all namespaces).--Prosfilaes (talk) 00:14, 23 April 2018 (UTC)
- The point isn’t that the corpus is extremely large, but that it’s large enough to verify whether words are in use and how they are used. — Vorziblix (talk · contribs) 01:47, 23 April 2018 (UTC)
- Constructed languages are just plain less interesting than natural languages. Knowing what a word is in Nahuatl can tell you something about ancient history, about the evolution of languages. Knowing what a word is in Lojban tells you nothing about history. Yes, we could copy a dictionary in, but that doesn't add value besides just keeping a copy of the dictionary. CFI means that for major modern languages, every word we have an entry on, in theory someone in the future could find it in a text and come looking for a definition. With the exception of a few conlangs, all of which are in mainspace, they don't have that; nothing has been written in the language besides a few didactic texts. To get the equivalent of a tiny library, your Lojban corpus tossed in the ephemeral IRC logs.--Prosfilaes (talk) 00:32, 23 April 2018 (UTC)
- Indeed, they are less linguistically interesting than natural languages, which doesn’t mean information about them is useless. Nor are the existing Lojban texts all ‘didactic’; they include original compositions of prose and poetry as well, and anyone finding words in any texts and wanting a definition would be well served by a dictionary. (The same is true for at least some of the other conlangs in appendix space, though probably not for all of them.) Maintaining such a dictionary can go well beyond just ‘copying a dictionary in’, as we also have attestations to work from. Records of non-literary communication in a language are also valuable and provide evidence of how that language is used by its community of speakers; corpuses of spoken language are not uncommon, for instance. — Vorziblix (talk · contribs) 01:47, 23 April 2018 (UTC)
- These kinds of languages do need to meet some standard of verification, even if a weakened standard such as "at least one mention", in my view. As for protologisms, Wiktionary:Votes/pl-2013-09/Deleting list of protologisms had no consensus, but perhaps times have changed. --Dan Polansky (talk) 11:16, 22 April 2018 (UTC)
- I fully agree. I would support the introduction of such a standard. — Vorziblix (talk · contribs) 17:07, 22 April 2018 (UTC)
- For LDLs, to quote CFI, "the community of editors for that language should maintain a list of materials deemed appropriate as the only sources for entries based on a single mention". I would assume (and desire, and advocate if there is no consensus about this) that a source should only be allowed to be on such a list if we trust it to be strictly descriptive like ourselves, hence why we wouldn't accept Urban Dictionary as a reference. Plena Ilustrita Vortaro might be considered "authoritative", but it also demonstrably contains lots of made-up words. For Lojban and the other languages, is there really a dictionary that we can trust not to try to "patch holes" in an incompletely developed vocabulary? If so, I guess my arguments for having special rules for conlangs kind of crumble, but I seriously doubt it is the case..__Gamren (talk) 08:09, 30 April 2018 (UTC)
- No, there isn’t a Lojban dictionary that we can trust to be descriptive, but, as noted above, there are corpora showing words in actual use, based on which a descriptive dictionary can straightforwardly be compiled. At least some of the other languages (Quenya, Sindarin, Klingon, ...) also have published texts in which terms are used in context, from which a descriptive dictionary (albeit a much sparser one) can also be compiled. — Vorziblix (talk · contribs) 15:58, 30 April 2018 (UTC)
- @Vorziblix Then I guess I don't really mind making Lojban an LDL after all. Do you have a rough sense of how many of our current entries would be citable using this korpora zei sisku?__Gamren (talk) 06:37, 3 May 2018 (UTC)
- @Gamren: To test this, I randomly sampled 50 entries out of Category:Lojban lemmas and manually checked for their attestation in the corpora. (I excluded the Tatoeba corpus from consideration because it consists of isolated example sentences.) Here’s the list of all sampled entries:
- Appendix:Lojban/joi
- Appendix:Lojban/linji
- Appendix:Lojban/cipra
- Appendix:Lojban/jgitrgitara
- Appendix:Lojban/alminiu
- Appendix:Lojban/cisma
- Appendix:Lojban/mikri
- Appendix:Lojban/dunli
- Appendix:Lojban/zgike
- Appendix:Lojban/bangu
- Appendix:Lojban/kilto
- Appendix:Lojban/xebe'i
- Appendix:Lojban/dimna
- Appendix:Lojban/boxfo
- Appendix:Lojban/mau
- Appendix:Lojban/fange
- Appendix:Lojban/dasni
- Appendix:Lojban/lu'o
- Appendix:Lojban/burcu
- Appendix:Lojban/cinje
- Appendix:Lojban/fanri
- Appendix:Lojban/gocti
- Appendix:Lojban/vorme
- Appendix:Lojban/vajni
- Appendix:Lojban/.y'y
- Appendix:Lojban/renro
- Appendix:Lojban/cliva
- Appendix:Lojban/mexygu'e
- Appendix:Lojban/cifnu
- Appendix:Lojban/zo'o
- Appendix:Lojban/fenso
- Appendix:Lojban/zi
- Appendix:Lojban/carmi
- Appendix:Lojban/palne
- Appendix:Lojban/mu'a
- Appendix:Lojban/sadjo
- Appendix:Lojban/benpi'a
- Appendix:Lojban/me
- Appendix:Lojban/kliru
- Appendix:Lojban/vidni
- Appendix:Lojban/kanla
- Appendix:Lojban/dzitricu
- Appendix:Lojban/jbixa'u
- Appendix:Lojban/prulamdei
- Appendix:Lojban/ru'e
- Appendix:Lojban/marna
- Appendix:Lojban/fo'e
- Appendix:Lojban/degja'i
- Appendix:Lojban/nutli
- Appendix:Lojban/mu'u
- And here are the results:
- 2 words were unattested (Appendix:Lojban/benpi'a and Appendix:Lojban/xebe'i).
- 1 word was attested only once (Appendix:Lojban/alminiu).
- 47 words were attested at least three times (all the rest). Most of these had a few hundred attestations.
- So, as a rough estimate, about 94% of the words would be citable. — Vorziblix (talk · contribs) 19:02, 4 May 2018 (UTC)
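The arithmetic behind that estimate can be sketched in Python, as a rough check on how much weight a 50-entry sample can bear. The Wilson score interval is not used anywhere in the discussion; it is swapped in here as one common way to put error bars on a sampled proportion, and the names `wilson_interval`, `sample_size`, and `citable` are hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Figures taken from the sample reported above.
sample_size = 50   # entries drawn from Category:Lojban lemmas
citable = 47       # entries attested at least three times

share = citable / sample_size
low, high = wilson_interval(citable, sample_size)

print(f"citable share: {share:.0%}")
print(f"approximate 95% interval: {low:.0%} to {high:.0%}")
```

Under these assumptions the point estimate is the 94% quoted above, while the interval (roughly the mid-80s to the high 90s in percent) shows that a sample of 50 leaves a fair amount of uncertainty about the true share.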
- And I still don't get the point of having entries in appendices! We can only either include something or not, and if we are including we might as well do so in the mainspace.__Gamren (talk) 08:27, 30 April 2018 (UTC)
- Unless this discussion suddenly jumpstarts, I'll probably start a vote later today.__Gamren (talk) 08:56, 30 April 2018 (UTC)
In the news - Swaziland has been renamed to eSwatini. --Anatoli T. (обсудить/вклад) 05:53, 20 April 2018 (UTC)
- That is definitely a hot word. And there's no guarantee that English speakers will start to use it over Swaziland. DTLHS (talk) 06:06, 20 April 2018 (UTC)
- I don't deny it but we should reflect the official announcements for country names, besides, the etymology is old and it's allegedly an older name for the country. --Anatoli T. (обсудить/вклад) 06:10, 20 April 2018 (UTC)
- When did people start using the Beer Parlour to talk about anything at all? We have the Tea Room to discuss words. --WikiTiki89 14:55, 20 April 2018 (UTC)
- Maybe the lang name should be changed, from "Swazi" as e.g. in siSwati to "siSwati" like the country's name did or might change. That would be a BP matter. If OP's post was a question like "should cats like Category:en:Swaziland be moved to the new name?" it would also be a BP matter. Question like "should the translations be moved from one entry to the other?" could be a BP matter (if asked rather general) or TR matter (if asked for just these two country names). If it was a question like "does the new name deserve an entry, because it's the official name, even if not attested?" it would also be a BP matter as it would require a change of WT:CFI -- but the answer should be "no": if not attested, then no entry. A usage note in Swaziland however could have been ok. -84.161.32.86 13:58, 30 April 2018 (UTC)
- It's absolutely too soon to be changing anything. All we did is create an entry for the words, we're continuing to call the country and the language and everything else by the old names until it becomes clear that they are no longer the most common terms. That would take years, if it ever happens at all. --WikiTiki89 14:09, 30 April 2018 (UTC)
- I know. I just wrote how this could be a BP matter. I didn't propose to change any Swazi term (as I don't really care about it). -84.161.32.86 14:22, 30 April 2018 (UTC)
Vote: CFI and images
FYI, I created Wiktionary:Votes/pl-2018-04/CFI and images.
Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:07, 21 April 2018 (UTC)
The Beer Barlour is a Discussion page and is categorised as such. A beer parlor is a kind of bar, but wasn't categorised as such. So, I added Category:en:Bars to a few pages, but I suck at categories so didn't make it, and probably people would prefer it named "Drinking establishments" or something like that, to avoid anyone putting [[crowbar]] therein. We can probably think of at least 100 kinds of bars, anyway. --Cien pies 6 (talk) 19:02, 21 April 2018 (UTC)
Tell us what you think about the automatic links for Wiktionary
Hello all,
One year ago, the Wikidata team started deploying new automatic interwiki links for Wiktionaries. Today, the links for the main namespace are automatically displayed by a MediaWiki extension, and the links for other namespaces are stored in Wikidata. You can find the documentation here.
We would like to know if you have encountered problems with the system, or if you have suggestions for further improvements. This could be for example:
- Some automatic links don’t work as expected
- Some problems you encountered with entering links (for non-main namespace) in Wikidata
- Some new features you’d like to have, related to links
To give feedback, you have two options:
- Leave a message on this talk page
- Leave a message here. If you do so, please mention me with the {{ping}} template, so I can get a notification.
I’m looking forward to your feedback! Lea Lacroix (WMDE) (talk) 10:16, 24 April 2018 (UTC)
- @Lea Lacroix (WMDE) Feature request: Provide a LUA or Parser function to query Cognate database (T163734). – Jberkel 10:38, 24 April 2018 (UTC)
- Indeed useful. I also support phab:T190210. When I search a page and it does not exist, I click to "create" and "preview" the blank page just to check the interwiki links. If they are available without saving the new page, it would be more useful to show the interwikis directly in the search none-found page and the no article text message. --Vriullop (talk) 18:19, 24 April 2018 (UTC)
- @Lea Lacroix (WMDE) Overall, excellent function. Thank you to all who made this happen. --Dan Polansky (talk) 18:09, 25 April 2018 (UTC)
- @Lea Lacroix (WMDE) I think it's good. We don't need a bot any more, and don't have to deal with human users adding and removing them erroneously. Equinox ◑ 18:12, 25 April 2018 (UTC)
- Not a complaint, but I note that there are some situations where this will not work - in particular, where projects have parallel appendices with their native-language names. bd2412 T 20:01, 25 April 2018 (UTC)
- @BD2412: Don't worry, it's only meant to be enabled in the main namespace. --WikiTiki89 23:38, 25 April 2018 (UTC)
- It would be useful to have Wikidata items in certain other spaces, though. bd2412 T 00:35, 26 April 2018 (UTC)
Long ſ in quotes
When quoting a book that uses long ſ, should it be reproduced in the quote? @Aabull2016 and I have been discussing this (brief background: I added it to some quotes and they reverted me) and we've been unable to come to an agreement. WT:" says using it is "optional," but I feel like we need a more concrete policy on this. Nloveladyallen (talk) 20:47, 26 April 2018 (UTC)
- I have no problem with the choices other contributors make on this issue. However, unless a clear blanket policy is established, I'd prefer not to have my contributions edited and links changed merely to display certain 17th or 18th century typographical features, as I do give careful consideration to the ways in which I present and reference quotations, and make every effort to make them of maximum use to the largest number of potential users. Aabull2016 (talk) 17:51, 27 April 2018 (UTC)
- I believe quotations should be reproduced as faithfully as possible. --WikiTiki89 18:19, 27 April 2018 (UTC)
- Then whenever possible in images. Transcribing to text is only faithful in those texts born and reproduced digitally.
- If we don't use the long s (or ſ, not the long ſ) in our spellings, it doesn't seem something that we need to preserve in quotations. It's a typographical feature more than a spelling feature.--Prosfilaes (talk) 06:40, 28 April 2018 (UTC)
- I definitely think we should reproduce the original text as faithfully as possible, regardless of whether anyone finds it "off-putting". I see no reason to change what we find into something else, when we have the opportunity not to.__Gamren (talk) 05:17, 30 April 2018 (UTC)
- Again, then we should use images. We abuse so much text--Ovid never wrote anything like the modern casing and punctuation that reneri assigns to him--why should we throw users under the bus to correctly represent the long s versus the s? It wasn't a spelling issue at the time; G.W., who wrote Magazine or Animadversions on the English Spelling in 1701, didn't think about ſ versus s at all. Why should we pervade our quotes with this distinction at all?
- Moreover, if we want to use this distinction, let's fix Shakespeare first. Go through every cite and pin it to an early edition, reproducing spelling and long s exactly. I think it an active detriment if we let Shakespeare be whatever we find in whatever edition, updated as it might be, but burden unknowns with an alienating use of the original typography, as if Shakespeare was somehow more modern than his contemporaries.--Prosfilaes (talk) 20:54, 30 April 2018 (UTC)
- I tried to do exactly that at 'rosemary', 'poison', 'dagger' and 'soundpost'. Kaixinguo~enwiktionary (talk) 21:09, 30 April 2018 (UTC)
- This has been discussed more than once over the years. AFAIK there's no policy mandating or prohibiting the long s, and this thread suggests we aren't likely to agree on one now, so editors can bother to reproduce it, or not, as they choose. In a few entries where the long-ness of the s has influenced the uses/spellings of the word, like windsucker~windfucker, reproducing it seems helpful and so desirable. Otherwise, it has no real benefit. Changing quotations that someone else has added seems like a poor (perhaps rude) use of one's time. - -sche (discuss) 21:09, 30 April 2018 (UTC)
- I find the long s obnoxious, similar to (whoever it was -- ReidAA?) trying to reproduce cited text precisely, including line breaks for word wrap, and long chains of &-nbsp; code to force alignment with spaces. As long as we don't actually re-spell things (like Webster 1913 naughtily citing colour and spelling it color, or such) I don't think it matters. As someone said above, it's typography. Equinox ◑ 21:16, 30 April 2018 (UTC)
- Good to know what I put is obnoxious to you. Kaixinguo~enwiktionary (talk)
- Yes, it's good to have feedback from the community about what they think. Equinox ◑ 21:39, 30 April 2018 (UTC)
Vote: Unifying on Inflection heading
FYI, I created Wiktionary:Votes/pl-2018-04/Unifying on Inflection heading.
Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 05:49, 28 April 2018 (UTC)
New Latin
I've added a section about New Latin to the About Latin page to start a discussion. Many New Latin terms are compiled in dictionaries prescriptively, which leads to a problem when those dictionaries are published by the arbiters of New Latin (e.g. the Latinitas Opus Fundatum in Civitate Vaticana) or arguably experienced Latinists (e.g. professors of Latin). So the question is, are these words to be ignored as unattested, as our own guidelines for inclusion insist, or should this rule be relaxed, allowing for inclusion of terms from a carefully curated list of acceptable dictionaries? --Robert.Baruch (talk) 17:07, 28 April 2018 (UTC)
- Wikipedia isn't necessarily correct. It even contradicts itself (and thus also [1]) with en:w:Template:Latin periods: "1500–present New Latin". Also it contradicts the entry New Latin.
As for "On Wiktionary, New Latin is considered to be the same as Contemporary Latin": I don't think so, cf. New Latin. And it's not necessarily correct or stated anywhere as accepted consensus. While Contemporary Latin terms are labelled "New Latin" (e.g. hamaxostichus), New Latin terms aren't necessarily considered to be from Contemporary Latin, i.e. Contemporary Latin might very well be a sub-form of New Latin or be merged into the more general New Latin.
- WT:CFI#Number of citations (mentions aren't accepted except for some sources) + Wiktionary:About Latin#Attestation (some sources accepted for mentions) makes it clear: as of now, mentions in modern dictionaries aren't sufficient for attesting Latin terms. That makes it a settled question. However, some terms that don't meet WT:CFI (such as usages on web pages which aren't durably archived, and singly mentioned terms not from Contemporary Latin dictionaries) could be added to Appendix:List of protologisms/non-English#Latin or to an appendix of their own.
- Contemporary dictionaries (such as the "Lexicon Recentis Latinitatis") presumably are under copyright. While it's OK for an individual to quote a few terms (at least in some countries), it doesn't seem right to use them for Wiktionary entries, as in the end WT could have copied all the terms out of them.
- -84.161.32.86 13:16, 30 April 2018 (UTC)
Italiot Greek
We use the term Italiot Greek as a synonym of Griko, i.e. the variety of Greek spoken at the tip of the "heel" of Italy. However, at Wikipedia, Italiot Greek is a cover term for both Griko and Calabrian Greek, spoken at the tip of the "toe" of Italy. Intuitively, I feel like the term ought to refer to all Greek lects spoken in Italy, in which case we should probably rename grk-ita to "Griko language" or something, but does anyone know what the term actually most commonly refers to in the linguistic literature? (It was almost five years ago that we discussed creating new codes for these two lects at all, but I'm not finding a discussion of what precise names we should use.) —Mahāgaja (formerly Angr) · talk 20:59, 28 April 2018 (UTC)
- Also (pinging @Aearthrise as our primary Italiot editor), if Wikipedia is to be believed, Calabrian is written in the Latin alphabet while Griko is written in the Greek alphabet, but all of our Italiot Greek lemmas are written in the Latin alphabet, making me wonder if they are actually in fact Calabrian rather than Griko. —Mahāgaja (formerly Angr) · talk 21:08, 28 April 2018 (UTC)
- @Mahagaja, Grekànika (Grecanic) and Katoitaliòtika (Κατωιταλιώτικα) are the names given to all the dialects of Greek in Italy: Apulian (from Salento) is called "Griko/Grico", and Calabrian (from Bova) is called "Greko/Greco". Comparatively, Apulian Griko (Γκρίκο) has a wide array of literature, administrative use, and a substantial number of speakers, while Calabrian Greko (Γκραίκο) is a rural tongue with a severe lack of literature and administrative use (it is close to becoming extinct).
- All of the lemmata I have added for Italiot Greek are from the Apulian dialect; words in Calabrian are very similar to Apulian: Cal. Avri, Apul. Avvri (tomorrow); Cal. Jineca, Apul. Jineka (woman); Cal. Discolo, Apul. Diskolo (difficult). Most of the vocabulary is the same, with minor differences in spelling. Their grammar (from what I know) is mostly (or exactly) the same.
- As for the correct alphabet of the Grecanic dialects, both Greek and Latin scripts are valid. I initially created Italiot Greek as a regional dialect of Greek, but because of the use of the Latin script (the Greek lemma page would begin with Latin words), we decided to use the grk-ita code. I have ported all of the Latin script words to grk-ita; I still need to complete the transfer of the Greek script words.
- I don't believe that we should use separate codes for Calabrian and Apulian Greek on Wiktionary. Glottolog concurs with my opinion: it lumps both together as Apulia-Calabrian Greek.
- @Aearthrise: Thanks for that comprehensive explanation. So does anyone object to retiring the code grk-cal, which doesn't even have any lemmas, and clarifying that grk-ita is a cover term for both Apulian and Calabrian Greek? We can make two regional dialects for it if we want: CAT:Apulian Greek already exists as a regional variety of CAT:Italiot Greek language, so we would just need to create CAT:Calabrian Greek as the other one. (I find these names much more helpful than, say, Griko and Greko or Grico and Greco!) Also, Aearthrise, would you be willing to start WT:About Italiot Greek just to clarify how we're using the term and everything else you mentioned above? —Mahāgaja (formerly Angr) · talk 06:05, 1 May 2018 (UTC)
- If there were some way to have the Latin letters sort after the Greek letters, then I think it would be better not to have a separate language code for Italiot Greek. --WikiTiki89 15:39, 1 May 2018 (UTC)
- I don't know about that. Glancing through a few Italiot lemmas, I find several that are quite different from their Greece-Greek synonyms: ammài is not just a Latin-alphabet spelling of μάτι (máti), while ajarài and πεταλούδα (petaloúda) are completely different words. I don't know how paparasciànni would even be spelled in the Greek alphabet since standard Greece-Greek doesn't have a /ʃ/ sound. I definitely agree with keeping Italiot a distinct language. Incidentally, we should get rid of CAT:Italiot Greek, which is categorized as a regional dialect of Greek, and keep only CAT:Italiot Greek language. —Mahāgaja (formerly Angr) · talk 17:04, 1 May 2018 (UTC)
- So what? Not every Italiot entry needs to have a Greek-alphabet equivalent for them to be treated as the same language. --WikiTiki89 17:15, 1 May 2018 (UTC)
- Keeping Italiot Greek separate would be consistent with what we've done in similar cases, e.g. the recently-discussed Mariupol Greek, and/so it seems reasonable to me, unless speakers themselves were to insist their lect was just a dialect. - -sche (discuss) 17:12, 1 May 2018 (UTC)
- All I was saying is that if the sorting order is only reason, it would be preferable to fix the sorting order rather than split it off. --WikiTiki89 17:15, 1 May 2018 (UTC)
- @Wikitiki89 the sorting order was the reason for creating grk-ita (grk-cal did exist beforehand); at this point, most of the lemmata are under the Italiot Greek tag - it would be easy to return them to the Greek page.
- @Mahagaja I will be honored to write the WT page!
On 2017/2/24 07:44, User:Wyang made the page for 南河. But it was only today, more than a year later, that 南河 was added to the list of compounds including 南 on the 南 page. I manually added 南河 to the 南 compound list. I have no knowledge of computer programming, but I think that there should be some way to automatically 'grab' all the pages with a given character and add them to the compounds lists. --Geographyinitiative (talk) 22:56, 28 April 2018 (UTC)
- Sounds like you're looking for the Grease Pit, not the Beer Parlour. Korn [kʰũːɘ̃n] (talk) 21:56, 30 April 2018 (UTC)
Gheg
We need to decide if Gheg is treated as a language or not. I don't know enough about Albanian to have an opinion on how we treat it, but having it be both a language and a dialect is confusing. I saw that there was a discussion about this in 2011, and it was ruled inconclusive. Any thoughts? – Gormflaith (talk) 23:39, 30 April 2018 (UTC)
- That's a fun discussion to read, lots of classic Dick Laurent. I agree that the combined macrolanguage-dialect method is problematic and should be discussed. The Compendium of the World's Languages says: "Tosk and Gheg are mutually intelligible, and differ indeed only in certain points - most importantly in the rhotacism of Tosk: Gheg -VnV- = Tosk -VrV-; e.g. Gheg zani 'voice', Tosk zeri; and in the formation of the future tense (see Verb, below)." As a result, I would support a merger into sq with dialectal forms found only in Tosk or Gheg but not in Standard Albanian to be labelled and categorised as such. —Μετάknowledgediscuss/deeds 00:03, 1 May 2018 (UTC)
- Pinging @Etimo as our resident Albanian-speaker: should Gheg and Tosk be considered one language? Are they mutually intelligible? Do native speakers/scholars/references think of them as one language? - -sche (discuss) 17:16, 1 May 2018 (UTC)
- @Torvalu4 too, who I think has some knowledge of Albanian. --Per utramque cavernam 08:00, 2 May 2018 (UTC)
- My position on the matter is basically that of Μετάknowledge. I've never read any expert treat the 2 as separate lang.s. They're really dialect groups and there's a subtle progression from one extreme to the other. The differences btw. the 2 extremes aren't minor though; it would be like the difference btw. Southern American and Cockney. That contrast would stretch intelligibility to the max, but they're still English. I think the best way to deal with them is to enter them all under "Albanian" (no qualification) and then add a label (Gheg, etc.) in the definition block. Incidentally, Arvanitic (aat) and Arbëreshë (aae) are treated differently by Wikt as well, but they're Tosk dialects. Torvalu4 (talk) 02:06, 3 May 2018 (UTC)
Technically speaking, Gheg and Tosk can be considered dialects of the same language, not two separate languages, as they differ only in some phonetic features. They are mutually intelligible, although intelligibility in some cases could be affected by the speaker's cultural level. A similar debate arose in Albanian academic circles some time ago, although it was mainly of a political nature, considering that the Gheg dialect has the oldest surviving literature and the total number of Gheg speakers slightly exceeds that of Tosk speakers (and therefore Gheg, rather than Tosk, should be considered standard Albanian). Tosk was made the standard Albanian language at the party-sponsored linguistic congress of 1972, as Albanian communists came almost exclusively from the South, which made it a political decision that did not go without repercussions. However, some efforts have been made to introduce Gheg features into standard Albanian in order to better harmonize and to somehow reconcile the two languages. Etimo (talk) 07:26, 2 May 2018 (UTC)
- OK, it sounds like they should be considered one language. I will start reheadering our few Gheg entries. - -sche (discuss) 01:28, 7 May 2018 (UTC)
- I've merged aae (Arbëreshë Albanian), aat (Arvanitika Albanian), aln (Gheg Albanian) into sq. - -sche (discuss) 23:10, 11 May 2018 (UTC)