Python wordcount exercise

An exercise from Google's Python Class.

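One task is to count how many times each word appears in a file. Note that for line in f yields one whole line of text per iteration, so you need the split() method to pull the individual words out of that line.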
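A minimal sketch of that point. The sample text is made up, and io.StringIO is used here only as a stand-in for a real file object so the snippet runs without a file on disk:

```python
import io

# Stand-in for an open text file: iterating over it yields one line at a time,
# the same way iterating over a real file object does.
f = io.StringIO("The quick brown fox\njumps over  the lazy dog\n")

words = []
for line in f:
    # line is a whole line, trailing newline included, not a single word.
    # split() with no arguments splits on any run of whitespace,
    # so the double space and the newline are both handled.
    words.extend(line.split())

print(words)
```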

sorted(iterable[, key][, reverse])

Return a new sorted list from the items in iterable. In other words, sorted always returns a sorted list,

so the result can be sliced, whereas a dict cannot be sliced.
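A small illustration of the difference, using a made-up dict of counts. Which exception the dict raises depends on the Python version, so the sketch catches both possibilities rather than assuming one:

```python
counts = {'the': 3, 'fox': 1, 'dog': 2}

# sorted() returns a list of (word, count) tuples, so slicing works.
by_count = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
top_two = by_count[:2]
print(top_two)  # [('the', 3), ('dog', 2)]

# Slicing the dict itself fails: a dict looks values up by key, not by position.
slice_error = None
try:
    counts[:2]
except (TypeError, KeyError) as exc:
    # TypeError on older versions (a slice is unhashable there);
    # KeyError on Python 3.12+, where slice objects became hashable.
    slice_error = exc

print(type(slice_error).__name__)
```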

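def word_count_dict(filename):
  # Map each lowercased word to the number of times it appears in the file.
  counts={}  # named counts so the built-in dict type is not shadowed
  # Plain 'r' mode: the 'U' flag of 'rU' was removed in Python 3.11.
  with open(filename,'r') as f:
    for line in f:
      for word in line.split():
        word=word.lower()
        if word in counts:
          counts[word]+=1
        else:
          counts[word]=1
  return counts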

def print_words(filename):
  word_count=word_count_dict(filename)
  words=sorted(word_count.keys())
  for word in words:
    print(word,word_count[word])

def get_count(word_count_tuple):
    return word_count_tuple[1]

def print_top(filename):
  word_count=word_count_dict(filename)
  # Sort the (word, count) pairs by count, largest first, then print the top 20.
  items=sorted(word_count.items(),key=get_count,reverse=True)
  for item in items[:20]:
    print(item[0],item[1])
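
For comparison only: the standard library's collections.Counter covers the same count-then-take-the-top-N steps. This is an alternative to the hand-written dict, not what the exercise asks for. top_words is a hypothetical helper name, and it takes a string instead of a filename so that the sketch is self-contained:

```python
from collections import Counter

def top_words(text, n=20):
    # Lowercase and split on whitespace, as the hand-written version does.
    counts = Counter(word.lower() for word in text.split())
    # most_common(n) returns a list of (word, count) pairs, highest count first.
    return counts.most_common(n)

sample = "the cat and the hat and the bat"
result = top_words(sample, 2)
print(result)  # [('the', 3), ('and', 2)]
```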
