Welcome to freestuffs’s documentation!¶
This is a Python 3.x package which scrapes free stuff from Craigslist. freestuffs is under the MIT license. Hosted on GitHub.
This package can be used to create a web application, such as the Treasure-map (source), or for use on Twitter.
Installation¶
Install using pip. freestuffs requires Python 3 and these dependencies:
- requests
- geopy
- folium
- BeautifulSoup4
- unidecode
Install:
pip install freestuffs
Documentation:
Introduction¶
Free Stuffs!¶
This is a Python 3.x package which scrapes free stuff from Craigslist. freestuffs is under the MIT license. Check out the source code and the docs.
- Using StuffScraper one can gather a list of free stuffs.
- Using StuffCharter, one can create an HTML map of the free stuffs.
This package can be used to create a web application, such as the Treasure-map (source), or for use on Twitter.
Installation¶
Install using pip. freestuffs requires Python 3 and these dependencies:
- requests
- geopy
- folium
- BeautifulSoup4
- unidecode
Install:
pip install freestuffs
Getting Started¶
Stuffs¶
The stuff class corresponds to a Craigslist free stuff posting. Its basic characteristics include title and location. Notably, there is no price attribute. If the posting has no image, the Wikipedia no-image image is used in its place.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs
>>> print(stuffs[0])
what: free shelves
where: Workman St, montreal
link: http://montreal.craigslist.ca/zip/5629811181.html
image: https://images.craigslist.org/00r0r_4p06sM5Hn4O_300x300.jpg
Scrape Stuffs¶
The StuffScraper class will scrape Craigslist for free stuff.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs # precise=False
>>> print(stuffs[0].thing) # Title
Meubles / furniture
To have the scraper automatically look up latitude and longitude coordinates, pass precise=True to the constructor.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> print(stuffs[0].coordinates)
['45.617854', '-73.633931']
Chart Stuffs¶
The StuffCharter class will produce a folium Map object populated with free stuff from the StuffScraper.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> from freestuffs.stuff_charter import StuffCharter
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> stuffs_chart = StuffCharter(stuffs)
call save_map(path) generate html map
>>> type(stuffs_chart.treasure_map)
<class 'folium.folium.Map'>
The StuffCharter object is a wrapper around the folium.Map.
Call save_map(HTML_PATH, CSS_PATH) to save an HTML map:
>>> stuffs_chart.save_map('webmap', 'static/style.css')
This function creates the directory if it is not found in the path. Call save_test_map() instead to generate an HTML map in the current directory.
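For example:
>>> stuffs_chart.save_test_map()  # saves the map in the current directory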
Legend¶
- The smaller the marker, the older the posting.
- The darker the border, the more markers overlap at that spot.
Triage¶
The triage applies a regex search for each category in this order (see the sketch after this list):
- Red are furniture “(wood, shelf, shelves, table, chair, scrap, desk)”.
- Blue are electronics “(tv, sony, ecran, speakers, wire, electronic, saw, headphones, arduino)”.
- Black are the “desired” stuffs “(book, games, cool, guide, box)”.
- White is default (no regex search matches).
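The same ordering can be sketched as a small regex dispatcher. This is an illustration built from the patterns listed above, not the library's actual sort_stuff code:
>>> import re
>>> FURNITURE = r'wood|shelf|shelves|table|chair|scrap|desk'
>>> ELECTRONIC = r'tv|sony|ecran|speakers|wire|electronic|saw|headphones|arduino'
>>> DESIRED = r'book|games|cool|guide|box'
>>> def triage_color(title):
...     # check the patterns in the documented order, falling back to white
...     for color, pattern in [('red', FURNITURE), ('blue', ELECTRONIC), ('black', DESIRED)]:
...         if re.search(pattern, title, re.IGNORECASE):
...             return color
...     return 'white'
...
>>> triage_color('Free wooden desk')
'red'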
Cookbook¶
Stuffs¶
The stuff class corresponds to a Craigslist free stuff posting. Its basic characteristics include title and location. Notably, there is no price attribute. If the posting has no image, the Wikipedia no-image image is used in its place.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs
>>> print(stuffs[0])
what: free shelves
where: Workman St, montreal
link: http://montreal.craigslist.ca/zip/5629811181.html
image: https://images.craigslist.org/00r0r_4p06sM5Hn4O_300x300.jpg
Scrape Stuffs¶
The StuffScraper class will scrape Craigslist for free stuff. The two required arguments are the city name and the quantity of stuff to scrape. The city name must conform to the Craigslist URL name; cities like New York become 'newyork'.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs # precise=False
>>> print(stuffs[0].thing) # Title
Meubles / furniture
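For a two-word city such as New York, the same call takes the Craigslist url form (a sketch; results depend on current postings):
>>> stuffs = StuffScraper('newyork', 5).stuffs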
To have the scraper automatically look up latitude and longitude coordinates, pass precise=True to the constructor.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> print(stuffs[0].coordinates)
['45.617854', '-73.633931']
Otherwise, one can call stuffs[0].find_coordinates() in order to set (and scrape) the stuff coordinates one by one.
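For example:
>>> stuffs = StuffScraper('montreal', 5).stuffs
>>> stuffs[0].find_coordinates()
>>> print(stuffs[0].coordinates)
['45.617854', '-73.633931']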
Pass in use_cl=True in order to ask for user input and override the location entered in the __init__:
>>> from freestuffs.stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('ill decide later', 1, use_cl=True).stuffs
What major city are you near? (or, 'help') newyork
>>> print(stuffs[0].location)
East Harle, New York
Chart Stuffs¶
The StuffCharter class will produce a folium Map object populated with free stuff from the StuffScraper. The StuffCharter object is a wrapper around the folium.Map.
>>> from freestuffs.stuff_scraper import StuffScraper
>>> from freestuffs.stuff_charter import StuffCharter
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> stuffs_chart = StuffCharter(stuffs)
call save_map(path) to generate html map
>>> type(stuffs_chart.treasure_map)
<class 'folium.folium.Map'>
Call save_map(HTML_PATH, CSS_PATH) to save an HTML map from the folium.Map object (equivalent to calling folium.Map.save(path)):
>>> stuffs_chart.save_map('webmap', 'static/style.css')
This function creates the directory if it is not found in the path. Call save_test_map() instead to generate an HTML map in the current directory.
Optionally, pass an address or zoom level into the constructor. Otherwise, if do_create_map is False, these attributes can be modified manually before calling create_map:
>>> from freestuffs.stuff_scraper import StuffScraper
>>> from freestuffs.stuff_charter import StuffCharter
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> stuffs_chart = StuffCharter(stuffs, zoom=15, do_create_map=False)
>>> stuffs_chart.zoom = 10 # default 13
>>> stuffs_chart.create_map()
call save_map(path) to generate html map
The stuff markers are colored circles drawn in diminishing sizes; the smaller the circle, the older the posting (this prevents overlapping markers from becoming inaccessible).
An address (but not the zoom level) can also be added after the map has been created:
>>> stuffs_chart.add_address('5989 Rue du Parc, Montreal, Quebec')
>>> print(stuffs_chart.address)
5989 Rue du Parc, Montreal, Quebec
And why stop at one address marker? The address attribute will always be the last address added:
>>> stuffs_chart.add_address('5989 Rue du Parc, Montreal, Quebec')
>>> stuffs_chart.add_address('604 Rue Saint Joseph, Montreal, Quebec')
>>> print(stuffs_chart.address)
604 Rue Saint Joseph, Montreal, Quebec
Override the CSS by adding links to the folium object header:
>>> import folium
>>> osm_map = stuffs_chart.treasure_map
>>> folium_figure = osm_map.get_root()
>>> folium_figure.header._children['bootstrap'] = folium.element.CssLink('/static/css/style.css')
When using the treasure_map as a template in a Python web app, the Leaflet Bootstrap CSS might conflict with user-defined styles. Add a CssLink before saving the map.
The fastest way to get a map up and running is to pass is_testing=True into the constructor:
>>> from freestuffs.stuff_scraper import StuffScraper
>>> from freestuffs.stuff_charter import StuffCharter
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> stuffs_chart = StuffCharter(stuffs, is_testing=True)
BEWARNED, this map is likely inaccurate:
Craigslist denizens care not for computer-precision
freestuffs package¶
Submodules¶
freestuffs.city_list module¶
Craigslist-friendly city names.
Run this script in order to print out a dict of Craigslist city names, for use in creating a valid URL.
- Attributes:
- CITIES – a dict of cities, with the human-friendly name as the key and the URL-friendly name as the value.
freestuffs.city_list.scrape_cities()¶
Scrape city and link for a dict of valid names.
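A possible lookup sketch, assuming the CITIES dict is importable from the module as documented above:
>>> from freestuffs.city_list import CITIES
>>> url_name = CITIES.get('New York')  # assumption: human-friendly key, url-friendly value such as 'newyork'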
freestuffs.stuff module¶
This houses the Stuff class.
Use stuff_scraper in order to gather a list of stuffs. For testing, the reverse Geolocator is at nominatim.openstreetmap.org.
Example usage:
>>> from stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs
>>> print(stuffs[0].thing) # Title
Meubles / furniture
>>> stuffs[0].find_coordinates() # pass precise=True in constructor
>>> print(stuffs[0].coordinates) # to automatically fetch coordinates
['45.617854', '-73.633931']
class freestuffs.stuff.Stuff(thing, url, location, image, user_location)¶
Bases: object
A freestuff Craigslist object.
Fill this object with the information from a Craigslist page. (There is no price attribute, because it is designed for invaluable things.) The precise coordinates are not initially set, because they require significantly more requests. See the find_coordinates() method.
- Attributes:
- thing – title of object, passed explicitly
- url – constructed from url, implicit
- image – passed explicitly
- user_location – passed explicitly, requires clean up
- coordinates – array of latitude and longitude
- Keyword arguments:
- thing –
- url –
- location –
- image –
- user_location – must conform to a valid Craigslist URL
find_coordinates()¶
Get and set latitude and longitude.
Scrape the individual posting page; if no coordinates are found, cascade precision (try location, then user_location, or set to zero). Returns an array, first latitude and then longitude.
freestuffs.stuff_charter module¶
Chart where free things are.
The StuffCharter class is a wrapper around the folium OpenStreetMap Python object, which in turn generates a Leaflet map.
Example usage:
>>> from stuff_scraper import StuffScraper
>>> from stuff_charter import StuffCharter
>>> stuffs = StuffScraper('montreal', 5, precise=True).stuffs
>>> treasure_map = StuffCharter(stuffs)
call save_map(path) generate html map
>>> treasure_map.save_test_map() # saves map in current dir
BEWARNED, this map is likely inaccurate:
Craigslist denizens care not for computer-precision
class freestuffs.stuff_charter.StuffCharter(stuffs, address=None, zoom=13, do_create_map=True, is_testing=False, is_flask=False)¶
Bases: object
Post folium map of freestuffs.
After constructing the map object, call save_map and pass in map_path in order to create the HTML map.
- Attributes:
- treasure_map – an OSM folium map object
- stuffs – list of free stuff
- user_location – the user's location
- start_coordinates – the origin coordinates for the city
- zoom – default map zoom
- Keyword arguments:
- stuffs – a list of stuff objects
- address – for an optional map marker of the user address
- do_create_map – set to False in order to modify attributes before calling create_map
- is_testing – use to test the module from the command line
- is_flask – automatically create a map for treasure-map
- zoom – the default map zoom level
add_address(_address)¶
Add an address marker to the folium map.
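For example, reusing the treasure_map from the module example above:
>>> treasure_map.add_address('5989 Rue du Parc, Montreal, Quebec')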
create_map(is_testing=False, is_flask=False)¶
Create a folium Map object, treasure_map.
treasure_map can be used to save an HTML Leaflet map. This method is called automatically on __init__ unless do_create_map is set to False.
- Keyword arguments:
- is_testing – creates a map in the webmap directory
- is_flask – creates a Flask map
find_city_center(location)¶
Return city center longitude and latitude.
save_flask_map()¶
Create an HTML map in the Flask server.
save_map(map_path, css_path=None)¶
Create an HTML map in map_path.
- Keyword arguments:
- map_path – the path to create the map in
- css_path – the path to the CSS override (defaults to Bootstrap via folium)
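For example:
>>> treasure_map.save_map('webmap', 'static/style.css')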
save_test_map()¶
Create an HTML map in the current directory.
Should have python -m http.server running in the directory.
sort_stuff(stuff)¶
Return a color according to regex search.
- Furniture pattern, red
- Electronics pattern, blue
- Miscellaneous pattern, black
- No match, white
sort_stuff returns the color of the first pattern matched, in that order.
- TODO:
- Set the patterns as modifiable attributes.
freestuffs.stuff_scraper module¶
This module is a Craigslist scraper.
Example usage:
>>> from stuff_scraper import StuffScraper
>>> stuffs = StuffScraper('montreal', 5).stuffs
>>> print(stuffs[0].thing) # Print title
Meubles / furniture
class freestuffs.stuff_scraper.StuffScraper(place, _quantity, precise=False, use_cl=False)¶
Bases: object
The freestuffs Craigslist scraper.
Compile parallel lists of stuff attributes in order to store a freestuffs list, with an option for including stuff coordinates.
- Attributes:
- stuffs – a list of stuff objects
- soup – bs4 soup of the Craigslist page
- place – the city to search, in Craigslist-friendly format
- locs, things, images, urls – stuff attribute lists
- quantity – how many stuffs gathered
- Keyword arguments:
- _quantity – how many stuffs to gather
- precise – a boolean to explicitly use the geolocator and crawl each individual posting URL
- use_cl – user input for place
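A short sketch of the parallel attribute lists (the attribute names are as documented above; the tuple packing is only illustrative):
>>> from freestuffs.stuff_scraper import StuffScraper
>>> scraper = StuffScraper('montreal', 5)
>>> first = (scraper.things[0], scraper.locs[0], scraper.urls[0])  # parallel entries for the first posting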
get_images(_soup)¶
Scrape images.
Uses the Wikipedia no-image image if no image is found.
- Keyword arguments:
- soup – bs4 object of a Craigslist freestuffs page
get_locations(user_place, _soup)¶
Scrape locations.
Returns a list of locations, more or less precise. Concatenate user_place to the string in order to aid the geolocator in case of duplicate location names in the world. Yikes.
- Keyword arguments:
- user_place – the city, in Craigslist format
- soup – bs4 object of a Craigslist freestuffs page
get_things(_soup)¶
Scrape titles.
- Keyword arguments:
- soup – bs4 object of a Craigslist freestuffs page
get_urls(_soup)¶
Scrape stuff URLs.
- Keyword arguments:
- soup – bs4 object of a Craigslist freestuffs page
refine_city_name(user_place)¶
Refine location for two-word cities.
setup_place()¶
Take command-line input of the user location.