Let’s turn the TV {on|off}
August 1, 2008
I am not a TV addict, but sometimes I like to watch movies on TV. Since I don’t have money to spend on pay-per-view or cable TV, before I turn on my TV and tune in to a channel, I check Folha Ilustrada for good films.
I think Folha Ilustrada is the best place to check, because all the movies there are rated as good, bad, not so bad, etc.
Yesterday I got bored of typing the URL into my Firefox browser to go to Folha Ilustrada and find something interesting. So I wrote a Python script to fetch the information for me. Look:
#!/usr/bin/python
# Python 2 script; needs BeautifulSoup 3.
import datetime
import re
import urllib2
from textwrap import TextWrapper

from BeautifulSoup import BeautifulSoup


class Films(object):
    _url = 'http://www1.folha.uol.com.br/folha/ilustrada/filmes/'
    # '%A' is locale-dependent; the mapping below assumes English day names.
    _today = datetime.date.today().strftime('%A')
    _days_of_week = {'Monday': 'segunda',
                     'Tuesday': 'terca',
                     'Wednesday': 'quarta',
                     'Thursday': 'quinta',
                     'Friday': 'sexta',
                     'Saturday': 'sabado',
                     'Sunday': 'domingo'}

    def __init__(self):
        self.view_films()

    def view_films(self):
        # Film entries are divs whose class starts with "localItem".
        regex = re.compile('localItem')
        # Strip the markup tags used inside each entry.
        clean_tags = re.compile('<(/|)(div|p|h1|h3|b|i)(| class="[^"]*")>')
        text_wrapper = TextWrapper(width=72)

        # Each weekday has its own page, e.g. .../filmes/sexta.shtml
        page = self._url + self._days_of_week[self._today] + '.shtml'
        resp = urllib2.urlopen(page)
        html = resp.read()
        resp.close()

        for div in BeautifulSoup(html).findAll('div'):
            try:
                if re.match(regex, div['class']):
                    # Decode first so wrapping counts characters, not bytes.
                    text = re.sub(clean_tags, '', str(div)).decode('utf8')
                    for line in text_wrapper.wrap(text):
                        print line
                    print '\n'
            except (KeyError, UnicodeError):
                # Skip divs without a class attribute or with bad encoding.
                pass


if __name__ == "__main__":
    Films()
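A side note on the script above: strftime('%A') returns the day name in whatever locale the machine uses, so the dictionary lookup can fail on a non-English system. Below is a minimal sketch of a locale-independent variant, written for Python 3 with only the standard library. The helper names page_url_for and clean_and_wrap, the catch-all tag pattern, and the sample HTML are my own assumptions for illustration; they are not part of the original script, and the real page markup may differ.

```python
import datetime
import re
import textwrap

BASE_URL = 'http://www1.folha.uol.com.br/folha/ilustrada/filmes/'

# date.weekday() returns 0 for Monday through 6 for Sunday,
# regardless of the system locale.
DAY_SLUGS = ['segunda', 'terca', 'quarta', 'quinta',
             'sexta', 'sabado', 'domingo']

# Matches any opening or closing tag (a crude filter, not an HTML parser).
TAG_RE = re.compile(r'<[^>]+>')


def page_url_for(day):
    """Build the listing URL for a datetime.date (hypothetical helper)."""
    return BASE_URL + DAY_SLUGS[day.weekday()] + '.shtml'


def clean_and_wrap(fragment, width=72):
    """Strip tags, collapse whitespace, and wrap to `width` columns."""
    text = TAG_RE.sub(' ', fragment)
    text = ' '.join(text.split())
    return textwrap.wrap(text, width=width)


if __name__ == '__main__':
    print(page_url_for(datetime.date.today()))
    # Made-up sample markup, only to exercise the cleaning step.
    sample = ('<div class="localItem"><h3>Some Film</h3>'
              '<p>A short synopsis.</p></div>')
    for line in clean_and_wrap(sample):
        print(line)
```

Indexing a list by weekday() avoids the English-name dictionary altogether, and replacing tags with a space keeps the title and the synopsis from running together after the markup is removed.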
Bye.







