Asked Sunday, April 24, 2022, 10:01:08

I have installed python-nltk on Ubuntu Server 12.04 using apt-get.



But when I try to download a corpus, I get the following error:



$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download('brown')
[nltk_data] Error loading brown: HTTP Error 401: Authorization
[nltk_data] Required
False


Am I missing some configuration or additional package?



 Answers

In the Ubuntu-packaged version, downloader.py still sets DEFAULT_URL to the old Google Code location:



DEFAULT_URL = 'http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml'


But the current data server is:



DEFAULT_URL = "http://nltk.github.com/nltk_data/"


You can install NLTK from source, or patch the already installed copy so that it points to the new server:



 sudo perl -pi -e 's#DEFAULT_URL = .*#DEFAULT_URL = "http://nltk.github.com/nltk_data/"#' /usr/lib/python2.7/dist-packages/nltk/downloader.py
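
If you would rather not use perl, the same substitution can be sketched in Python. This is only an illustration of what the one-liner does: the helper names `patch_default_url` and `patch_file` are made up for this example and are not part of NLTK, and the URL is the one quoted above.

```python
import re

# New index location quoted in the answer above.
NEW_URL = "http://nltk.github.com/nltk_data/"

# Matches a "DEFAULT_URL = ..." assignment up to the end of its line,
# mirroring the perl pattern `DEFAULT_URL = .*`.
_ASSIGNMENT = re.compile(r"DEFAULT_URL = .*")


def patch_default_url(source, new_url=NEW_URL):
    """Return `source` with every DEFAULT_URL assignment pointing at new_url."""
    replacement = 'DEFAULT_URL = "%s"' % new_url
    # A function replacement avoids backslash-escape processing of the URL.
    return _ASSIGNMENT.sub(lambda match: replacement, source)


def patch_file(path, new_url=NEW_URL):
    """Rewrite the file at `path` in place; needs write permission (e.g. sudo)."""
    with open(path) as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(patch_default_url(original, new_url))
```

Calling `patch_file("/usr/lib/python2.7/dist-packages/nltk/downloader.py")` as root should have the same effect as the perl command. Any indentation in front of `DEFAULT_URL` is left untouched because the pattern starts matching at the name itself.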


You can then install the "brown" corpus:



$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download('brown')
[nltk_data] Downloading package 'brown' to /home/sylvain/nltk_data...
[nltk_data] Unzipping corpora/brown.zip.
True
>>> from nltk.corpus import brown
>>> brown.words()
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]
>>>
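
If you cannot or do not want to edit the packaged file, another option may be to hand the index URL to a `Downloader` instance yourself. This is a rough sketch, assuming your NLTK version's `nltk.downloader.Downloader` accepts a `server_index_url` argument (I have not checked this against the 12.04 package); `download_from` is a hypothetical helper name used only for illustration.

```python
# URL quoted in the answer above; substitute whichever index your NLTK needs.
INDEX_URL = "http://nltk.github.com/nltk_data/"


def download_from(package_id, index_url=INDEX_URL):
    """Download one NLTK data package from an explicit index URL."""
    # Imported lazily so this module loads even where NLTK is absent.
    from nltk.downloader import Downloader

    # Assumption: this constructor argument exists in the installed version.
    downloader = Downloader(server_index_url=index_url)
    return downloader.download(package_id)
```

With that in place, `download_from('brown')` would stand in for `nltk.download('brown')` without touching anything under /usr/lib.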

Answered Monday, April 25, 2022
jokaned