Yes, there are some. One example is igerman98_all.xml.bz2, a German lemma list in XML format based on the ispell word list, from Niels Ott's BananaSplit.
Or you could generate the list from your own text with the TreeTagger tool. It assigns a tag to each text token that tells you, for example, that this one is a noun and that one is an article.
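As a rough sketch of that idea, the snippet below collects noun lemmas from tagger output. It assumes the output has one token per line with three tab-separated columns (token, tag, lemma) and that common nouns carry the tag NN, as in the STTS tagset. The function name noun_lemmas and the sample input are made up for illustration; the sample is hand-written, not real tool output.

```python
def noun_lemmas(tagged_text, noun_tags=("NN",)):
    """Collect lowercased noun lemmas from tab-separated tagger output.

    Assumed line format: token<TAB>tag<TAB>lemma. Lines that do not have
    exactly three columns are skipped.
    """
    lemmas = set()
    for line in tagged_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        token, tag, lemma = parts
        if tag in noun_tags:
            # Fall back to the token itself when the tagger has no lemma.
            word = token if lemma == "<unknown>" else lemma
            lemmas.add(word.lower())
    return lemmas


# Hand-written sample in the assumed format.
sample = "Das\tART\tdie\nHaus\tNN\tHaus\nist\tVAFIN\tsein\n"
print(noun_lemmas(sample))
```

A list built this way still contains every compound noun that occurs in the text, so it needs further filtering before the splitter can use it.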
But the problem with dictionaries is that most of them also include compound words, which we don't want in our dictionary. The splitter needs words in their most basic form. If the dictionary contains compound words, the splitter won't break them up any further ...
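To illustrate the effect, here is a minimal sketch of a dictionary-based splitter. It is not the actual splitter discussed here, just a toy version that assumes a longest-prefix-first strategy and ignores linking elements such as the "s" between word parts. The name split_compound and the word lists are made up for the example.

```python
def split_compound(word, dictionary):
    """Split word into dictionary parts; return [word] if no full split exists."""
    word = word.lower()
    # A word that is itself in the dictionary is never split further.
    if word in dictionary:
        return [word]
    # Try the longest known prefix first, then split the rest recursively.
    for i in range(len(word) - 1, 0, -1):
        head, tail = word[:i], word[i:]
        if head in dictionary:
            rest = split_compound(tail, dictionary)
            if all(part in dictionary for part in rest):
                return [head] + rest
    return [word]


basic = {"haus", "boot", "verleih"}
with_compounds = basic | {"hausboot"}

print(split_compound("Hausbootverleih", basic))           # ['haus', 'boot', 'verleih']
print(split_compound("Hausbootverleih", with_compounds))  # ['hausboot', 'verleih']
print(split_compound("Hausboot", with_compounds))         # ['hausboot']
```

As soon as "hausboot" is in the word list, it is treated as a basic word and stays in one piece.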
So I decided to create a list myself. I started from the 500 most-used search terms on my website and split them manually. It was easy and did not take long.