Utilizator:Flubot/cedillas

I am currently (20 March 2011) replacing cedillas in foreign-language words (not in Turkish words, of course). I am using replace.py with the fix below, which lives in fixes.py. The fix is incomplete and must not be left running unsupervised, i.e. do not answer "a" (always). Although I am being as careful as I can, some Turkish words may have been changed by mistake. It is therefore essential that, once this procedure is finished, all Turkish words be checked again for errors.

Categories and interwikis are not to be changed yet.

Done: el, fr, de, es, en, it, no, nl, pl, pt, ru, vec, zh, ja

I have started editing Romanian words, in groups of 200. I get an XML file from Special:Export and examine it for Turkish sections (limba|tur), Turkish words ([[tur) and audio files. When I find a new Turkish word, I add it to the mask list below. Then I run the fix carefully, reading each modification before accepting it. I hope that I have not made any errors. --Flyax 27 martie 2011 17:19 (UTC)
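The manual check of the export file described above could be partly automated. The following is only a sketch in Python 3, not the procedure actually used: the function names, the sample XML, the namespace URI and the regular expression for file links are all assumptions made for illustration. It lists the pages in an export that contain one of the three markers, so that they can be reviewed by hand before the fix is run.

```python
import re
import xml.etree.ElementTree as ET

# Markers named in the note: Turkish sections, links to Turkish words, and
# file links. The file-link regex is an assumption (it accepts "Fisier",
# "Fişier" and "Fișier" as the namespace prefix).
MARKERS = {
    "turkish_section": re.compile(r"limba\|tur"),
    "turkish_link": re.compile(r"\[\[tur"),
    "file_link": re.compile(r"\[\[Fi[sşș]ier:"),
}


def local_name(tag):
    """Drop the '{namespace}' prefix that export files put on every tag."""
    return tag.rsplit("}", 1)[-1]


def scan_export(xml_text):
    """Return {page title: [names of markers found]} for pages to review."""
    root = ET.fromstring(xml_text)
    flagged = {}
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = "", ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text or ""
            elif name == "text":
                text = node.text or ""
        hits = [name for name, rx in MARKERS.items() if rx.search(text)]
        if hits:
            flagged[title] = hits
    return flagged


# Made-up sample in the general shape of a Special:Export file.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page><title>oraş</title><revision><text>{{limba|ron}} un oraş mare</text></revision></page>
  <page><title>paşa</title><revision><text>{{limba|tur}} [[tur:paşa]]</text></revision></page>
  <page><title>ţară</title><revision><text>[[Fişier:exemplu.ogg]]</text></revision></page>
</mediawiki>"""

for title, hits in scan_export(SAMPLE).items():
    print(title, hits)
```

A page that is flagged is not necessarily a problem; the list only narrows down what has to be read before the replacements are accepted.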

'romanian': {
        'regex' : True,
        'msg': {
           'el':u'replacing ţ and ş with ț, ș',
        },
        'replacements': [
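	#mask: swap the cedillas that must survive (file names, interwiki links, categories, {{trad}} arguments, known Turkish words) for placeholder characters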
	(u'\[\[Fişier\:([^\|]*)ţ([^\|]*)ţ', u'[[Fișier:\\1τ\\2τ'),
	(u'\[\[Fișier\:([^\|]*)ţ([^\|]*)ţ', u'[[Fișier:\\1τ\\2τ'),
	(u'\[\[Fişier\:([^\|]*)ţ', u'[[Fișier:\\1τ'),
	(u'\[\[Fișier\:([^\|]*)ţ', u'[[Fișier:\\1τ'),
	(u'\[\[Fişier\:([^\|]*)ş([^\|]*)ş', u'[[Fișier:\\1@\\2@'),
	(u'\[\[Fișier\:([^\|]*)ş([^\|]*)ş', u'[[Fișier:\\1@\\2@'),
	(u'\[\[Fişier\:([^\|]*)ş', u'[[Fișier:\\1@'),
	(u'\[\[Fișier\:([^\|]*)ş', u'[[Fișier:\\1@'),
	(u'\[\[([a-z]+)\:([^\]]*)ş', u'[[\\1:\\2@'),
	(u'\[\[([a-z]+)\:([^\]]*)ş([^\]]*)ş', u'[[\\1:\\2@\\3@'),
	(u'\[\[([a-z]+)\:([^\]]*)ş', u'[[\\1:\\2@'),
	(u'\[\[([a-z]+)\:([^\]]*)ţ([^\]]*)ţ', u'[[\\1:\\2&\\3&'),
	(u'\[\[([a-z]+)\:([^\]]*)ţ', u'[[\\1:\\2&'),
	(u'\[\[Categorie\:([^\]]*)ţ([^\]]*)ţ', u'[[Categorie:\\1@\\2@'),
	(u'\[\[Categorie\:([^\]]*)ţ', u'[[Categorie:\\1@'),
	(u'\[\[Categorie\:([^\]]*)ş', u'[[Categorie:\\1ψ'),
	(u'\{\{trad\|([^\|]+)\|([^\}]*)ş([^\}]*)ş', u'{{trad|\\1|\\2@\\3@'),
	(u'\{\{trad\|([^\|]+)\|([^\}]*)ş', u'{{trad|\\1|\\2$'),
	(u'başkent', u'ba@kent'),
	(u'şehir', u'@ehir'),
	(u'akşın', u'ak@ın'),
	(u'myşyak', u'my@yak'),
	(u'mişyak', u'mi@yak'),
	(u'myşyák', u'my@yák'),
	(u'kaşkaval', u'ka@kaval'),
	(u'düşman', u'dü@man'),
	(u'paşa', u'pa@a'),
	(u'tebeşir', u'tebe@ir'),
	(u'tavşan', u'tav@an'),
	#modify: convert the remaining cedilla letters to their comma-below forms
	(u'Ţ', u'Ț'),
        (u'ş', u'ș'),
        (u'ţ', u'ț'),
	#unmask: put the original cedillas back in the protected spans
	(u'\[\[Fișier\:([^\|]*)τ([^\|]*)τ', u'[[Fișier:\\1ţ\\2ţ'),
	(u'\[\[Fișier\:([^\|]*)τ', u'[[Fișier:\\1ţ'),
	(u'\[\[Fișier\:([^\|]*)@([^\|]*)@', u'[[Fișier:\\1ş\\2ş'),
	(u'\[\[Fișier\:([^\|]*)@', u'[[Fișier:\\1ş'),
	(u'\[\[([a-z]+)\:([^\]]*)@([^\]]*)@', u'[[\\1:\\2ş\\3ş'),
	(u'\[\[([a-z]+)\:([^\]]*)&([^\]]*)&', u'[[\\1:\\2ţ\\3ţ'),
	(u'\[\[([a-z]+)\:([^\]]*)@', u'[[\\1:\\2ş'),
	(u'\[\[([a-z]+)\:([^\]]*)&', u'[[\\1:\\2ţ'),
	(u'\[\[Categorie\:([^\]]*)@([^\]]*)@', u'[[Categorie:\\1ţ\\2ţ'),
	(u'\[\[Categorie\:([^\]]*)@', u'[[Categorie:\\1ţ'),
	(u'\[\[Categorie\:([^\]]*)ψ', u'[[Categorie:\\1ş'),
	(u'\{\{trad\|([^\|]+)\|([^\}]*)@([^\}]*)@', u'{{trad|\\1|\\2ş\\3ş'),
	(u'\{\{trad\|([^\|]+)\|([^\}]*)\$', u'{{trad|\\1|\\2ş'),
	(u'my@yak', u'myşyak'),
	(u'mi@yak', u'mişyak'),
	(u'my@yák', u'myşyák'),
	(u'ba@kent', u'başkent'),
	(u'@ehir', u'şehir'),
	(u'ak@ın', u'akşın'),
	(u'ka@kaval', u'kaşkaval'),
	(u'dü@man', u'düşman'),
	(u'pa@a', u'paşa'),
	(u'tebe@ir', u'tebeşir'),
	(u'tav@an', u'tavşan'),
	      ],
    },
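The fix above follows a mask, modify, unmask pattern. The sketch below shows the same idea in isolation, in Python 3 and without pywikipedia. It assumes that the (pattern, replacement) pairs are applied one after another in list order; the helper names `build_pairs` and `apply_pairs` are hypothetical, and only three of the Turkish words from the mask list are used.

```python
import re

# Code points involved, written as escapes to avoid look-alike confusion.
S_CEDILLA = "\u015f"        # ş (used in Turkish, legacy in Romanian text)
S_COMMA = "\u0219"          # ș (correct Romanian letter)
T_CEDILLA = "\u0163"        # ţ
T_COMMA = "\u021b"          # ț
T_CEDILLA_UPPER = "\u0162"  # Ţ
T_COMMA_UPPER = "\u021a"    # Ț

PLACEHOLDER = "@"  # same stand-in character the fix uses for ş

# Subset of the Turkish words protected by the fix: başkent, şehir, paşa.
PROTECTED = [
    "ba" + S_CEDILLA + "kent",
    S_CEDILLA + "ehir",
    "pa" + S_CEDILLA + "a",
]


def build_pairs(protected_words):
    """Build the ordered mask, modify and unmask replacement pairs."""
    mask = [(re.escape(w), w.replace(S_CEDILLA, PLACEHOLDER))
            for w in protected_words]
    modify = [
        (T_CEDILLA_UPPER, T_COMMA_UPPER),
        (S_CEDILLA, S_COMMA),
        (T_CEDILLA, T_COMMA),
    ]
    unmask = [(re.escape(w.replace(S_CEDILLA, PLACEHOLDER)), w)
              for w in protected_words]
    return mask + modify + unmask


def apply_pairs(text, pairs):
    """Apply every (pattern, replacement) pair in order, as regexes."""
    for pattern, replacement in pairs:
        text = re.sub(pattern, replacement, text)
    return text


# "oraş" and "ţară" are converted; "paşa" and "şehir" keep their cedillas.
before = ("ora" + S_CEDILLA + ", " + T_CEDILLA + "ar\u0103, pa"
          + S_CEDILLA + "a, " + S_CEDILLA + "ehir")
print(apply_pairs(before, build_pairs(PROTECTED)))
```

The pattern has the same weakness as the fix itself: text that already contains a placeholder sequence such as "pa@a" would be turned into a Turkish word by the unmask step, which is one more reason to read every proposed change before accepting it.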