Web Scraping from Bloomberg – Python

The code below is a simple web scraper that grabs the name and price of the S&P 500 Index from the Bloomberg website. Several Python scraping libraries could produce the same output, but the two imported here were the easiest to work with. In the next few posts I will show some ways to grab multiple fields and save them to a CSV file or SQL database. I will also explore automating these scripts with bots and keeping the number of requests as human-like as possible.

from urllib.request import urlopen  # Python 3; in Python 2 this was urllib2.urlopen
from bs4 import BeautifulSoup

# specify the url
quote_page = 'http://www.bloomberg.com/quote/SPX:IND'

# query the website and return the html to the variable 'page'
page = urlopen(quote_page)

# parse the html using Beautiful Soup and store it in the variable 'soup'
soup = BeautifulSoup(page, 'html.parser')

# find the <h1> that holds the name and get its text
name_box = soup.find('h1', attrs={'class': 'name'})
name = name_box.text.strip()  # strip() removes leading and trailing whitespace
print(name)

# get the index price from the <div> with class 'price'
price_box = soup.find('div', attrs={'class': 'price'})
price = price_box.text
print(price)


S&P 500 Index
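
As mentioned at the top, other libraries can produce the same output. Here is a minimal sketch that uses only the standard library's html.parser module and runs against a small inline HTML sample instead of the live page. The sample markup, the placeholder price, and the QuoteParser class are all assumptions made for illustration; the real Bloomberg page may be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup and placeholder price, for illustration only.
SAMPLE_HTML = """
<html><body>
  <h1 class="name"> S&amp;P 500 Index </h1>
  <div class="price">1,234.56</div>
</body></html>
"""


class QuoteParser(HTMLParser):
    """Collect the text inside <h1 class="name"> and <div class="price">."""

    # (tag, class) pairs to capture, mapped to the field each one fills
    TARGETS = {('h1', 'name'): 'name', ('div', 'price'): 'price'}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field currently being captured, if any
        self._tag = None      # tag that opened the current capture

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            return  # already capturing; ignore nested tags
        classes = (dict(attrs).get('class') or '').split()
        for (target_tag, target_class), field in self.TARGETS.items():
            if tag == target_tag and target_class in classes:
                self._current = field
                self._tag = tag
                self.fields[field] = ''
                break

    def handle_data(self, data):
        # append text only while inside one of the target elements
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._tag:
            # same job as strip() in the Beautiful Soup version
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = None
            self._tag = None


parser = QuoteParser()
parser.feed(SAMPLE_HTML)
parser.close()

# .get() returns None instead of raising if a field was not found
print(parser.fields.get('name'))
print(parser.fields.get('price'))
```

On the sample above this prints the name and the placeholder price. It takes noticeably more code than the Beautiful Soup version, and it does not handle a target element nested inside another element with the same tag, which is a big part of why Beautiful Soup was the easier choice.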