A Python error: too many values to unpack
I'm trying to extract proper nouns from a tagged file. The problem is that the code I'm trying gives this error:
Traceback (most recent call last):
  File "e:\pt\paragraph", line 35, in <module>
    sen1= noun(mylist[s])
  File "e:\pt\paragraph", line 5, in noun
    word, tag = word.split('/')
ValueError: too many values to unpack
The code works fine for some texts but gives the error for others.
The code:
def noun(words):
    nouns = []
    for word in words.split():
        word, tag = word.split('/')
        if (tag.lower() == 'np'):
            nouns.append(word);
    return nouns

def splitparagraph(paragraph):
    import re
    paragraphs = paragraph.split('\n\n')
    return paragraphs

if __name__ == '__main__':
    import nltk
    rp = open("t3.txt", 'r')
    text = rp.read()
    mylist = []
    para = splitparagraph(text)
    for s in para:
        mylist.append(s)
    for s in range(len(mylist)-1):
        sen1= noun(mylist[s])
        sen2= noun(mylist[s+1])
The code I'm trying works if I remove the 1st paragraph; otherwise it gives the error.
Sample of the text:
a/at good/jj man/nn-hl departs/vbz-hl ./. goodbye/uh-hl ,/,-hl mr./np-hl sam/np-hl./.

sam/np rayburn/np was/bedz a/at good/jj man/nn ,/, a/at good/jj american/np ,/, and/cc ,/, third/od ,/, a/at good/jj democrat/np ./. he/pps was/bedz all/abn of/in these/dts rolled/vbn into/in one/cd sturdy/jj figure/nn ;/. ;/. mr./np speaker/nn-tl ,/, mr./np sam/np ,/, and/cc mr./np democrat/np ,/, at/in one/cd and/cc the/at same/ap time/nn ./. the/at house/nn-tl was/bedz his/pp$ habitat/nn and/cc there/rb he/pps flourished/vbd ,/, first/rb as/cs a/at young/jj representative/nn ,/, then/rb as/cs a/at forceful/jj committee/nn chairman/nn ,/, and/cc finally/rb in/in the/at post/nn for/in which/wdt he/pps seemed/vbd intended/vbn from/in birth/nn ,/, speaker/nn-tl of/in-tl the/at-tl house/nn-tl ,/, and/cc second/od most/ql powerful/jj man/nn in/in washington/np ./.
If I remove the 1st paragraph (a/at good/jj man/nn-hl departs...) the code works. How can I solve this problem?
Thanks in advance.
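Your "word" contains more than one "/", so unpacking the result of split into (word, tag) will not work. In your sample, the token sam/np-hl./. at the end of the first paragraph splits into three parts instead of two. You'll have to figure out how you want to handle the case where a word/tag token has more than one "/".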
def noun(words):
    nouns = []
    for word in words.split():
        items = word.split('/')
        if len(items) == 2:
            # exactly one "/": the usual word/tag case
            word, tag = items
        else:
            # else parse ....
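One possible way to fill in the else branch, as a minimal sketch: skip any token that does not split into exactly two parts, and use a small helper to list those tokens so they can be fixed in the file. The helper name bad_tokens and the decision to skip malformed tokens are assumptions for illustration, not part of the original code.

```python
def bad_tokens(words):
    """List tokens that do not contain exactly one '/' (hypothetical helper)."""
    return [t for t in words.split() if t.count('/') != 1]


def noun(words):
    """Collect the words tagged 'np' from a string of word/tag tokens."""
    nouns = []
    for token in words.split():
        items = token.split('/')
        if len(items) != 2:
            continue  # assumption: malformed tokens are simply skipped
        word, tag = items
        if tag.lower() == 'np':
            nouns.append(word)
    return nouns


sample = "goodbye/uh-hl ,/,-hl mr./np-hl sam/np-hl./. sam/np rayburn/np"
print(bad_tokens(sample))  # ['sam/np-hl./.']
print(noun(sample))        # ['sam', 'rayburn']
```

Running bad_tokens over each paragraph first shows which tokens trigger the error before deciding how to treat them.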
I realized you can also use the "maxsplit" option of the string split method if you want to split only on the first "/".
>>> word = "a/b/c"
>>>
>>> word.split("/", 1)
['a', 'b/c']
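Applied to the tagged text, the maxsplit idea could look like the sketch below. It assumes the word itself never contains a "/", so everything after the first "/" is treated as the tag. For a glued token such as sam/np-hl./. the tag becomes np-hl./. and simply fails the comparison with 'np' instead of raising an error.

```python
def noun(words):
    """Collect the words tagged 'np', splitting each token on its first '/'."""
    nouns = []
    for token in words.split():
        # Split on the first "/" only, so any extra slashes stay in the tag
        parts = token.split('/', 1)
        if len(parts) != 2:
            continue  # token had no "/" at all
        word, tag = parts
        if tag.lower() == 'np':
            nouns.append(word)
    return nouns


print(noun("mr./np-hl sam/np-hl./. sam/np rayburn/np ./."))  # ['sam', 'rayburn']
```

If the words themselves may contain "/", rsplit('/', 1) splits on the last "/" instead of the first.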