pycraigslist

A fast and expressive Craigslist API wrapper.

Disclaimer

  • I do not work or have an affiliation with Craigslist.

  • This library is intended for educational purposes. It is not advised to crawl and download data from Craigslist.

Installation

pip install pycraigslist

Quick Start

Find cars & trucks for sale with keyword "Mazda Miata" in the East Bay Area, California:

import pycraigslist

miatas = pycraigslist.forsale.cta(site="sfbay", area="eby", query="Mazda Miata")
for miata in miatas.search():
    print(miata)

>>> {'country': 'US',
    'region': 'CA',
    'site': 'sfbay',
    'area': 'eby',
    'category': 'cto',
    'id': '7291715564',
    'repost_of': '',
    'last_updated': '2021-03-15 09:06',
    'title': '1990 Mazda Miata',
    'neighborhood': 'oakland lake merritt / grand',
    'price': '$5,000',
    'url': 'https://sfbay.craigslist.org/eby/cto/d/oakland-1990-mazda-miata/7291715564.html'}
    # ...

Background

This library is intended to be expressive and easy to use.

pycraigslist classes

  • pycraigslist.community (craigslist.org > community)

  • pycraigslist.events (craigslist.org > event calendar)

  • pycraigslist.forsale (craigslist.org > for sale)

  • pycraigslist.gigs (craigslist.org > gigs)

  • pycraigslist.housing (craigslist.org > housing)

  • pycraigslist.jobs (craigslist.org > jobs)

  • pycraigslist.resumes (craigslist.org > resumes)

  • pycraigslist.services (craigslist.org > services)

We can search for posts in parent classes. For example, finding paid gigs in Portland, Oregon:

import pycraigslist

paid_gigs = pycraigslist.gigs(site="portland", is_paid=True)
for gig in paid_gigs.search():
    print(gig)

>>> {'country': 'US',
    'region': 'OR',
    'site': 'portland',
    'area': 'mlt',
    'category': 'lbg',
    'id': '7295392821',
    'repost_of': '7292985211',
    'last_updated': '2021-03-22 13:00',
    'title': 'Packing and moving',
    'neighborhood': 'SE Portland',
    'price': '',
    'url': 'https://portland.craigslist.org/mlt/lbg/d/portland-packing-and-moving/7295392821.html'}
    # ...

pycraigslist subclasses

Most pycraigslist classes have subclasses to allow for categorical searches. For example:

  • pycraigslist.forsale.bia (craigslist.org > for sale > bikes)

  • pycraigslist.forsale.cta (craigslist.org > for sale > cars & trucks)

  • pycraigslist.housing.apa (craigslist.org > housing > apartments / housing for rent)

  • pycraigslist.housing.roo (craigslist.org > housing > apartments / rooms & shares)

Finding pycraigslist subclasses

Use class method .get_categories() to search for subclasses. The resulting keys are the subclass names.

import pycraigslist

print(pycraigslist.housing.get_categories())

>>> {'apa': 'apartments / housing for rent',
    'swp': 'housing swap',
    'off': 'office & commercial',
    'prk': 'parking & storage',
    'rea': 'real estate',
    'reb': 'real estate - by dealer',
    'reo': 'real estate - by owner',
    'roo': 'rooms & shares',
    'sub': 'sublets & temporary',
    'vac': 'vacation rentals',
    'hou': 'wanted: apts',
    'rew': 'wanted: real estate',
    'sha': 'wanted: room/share',
    'sbw': 'wanted: sublet/temp'}

We’d choose pycraigslist.housing.vac if we’re interested in searching for vacation rentals.

Finding and using filters

We can apply filters to our search. Use .get_filters() to find valid filters for a class or subclass instance.

import pycraigslist

tokyo_autos = pycraigslist.forsale.cta(site="tokyo")
print(tokyo_autos.get_filters())

>>> {'query': '...', 'search_titles': 'True/False', 'has_image': 'True/False',
    'posted_today': 'True/False', 'bundle_duplicates': 'True/False',
    'search_distance': '...', 'zip_code': '...', 'min_price': '...', 'max_price': '...',
    'make_model': '...', 'min_year': '...', 'max_year': '...', 'min_miles': '...',
    'max_miles': '...', 'min_engine_displacement': '...', 'max_engine_displacement': '...',
    'condition': ['新品', 'ほぼ新品', '美品', '良品', '使用に問題なし', 'サルベージ'],
    'auto_cylinders': ['3気筒', '4気筒', '5気筒', '6気筒', '8気筒', '10気筒', '12気筒', 'その他'],
    'auto_drivetrain': ['前輪', '後輪', '4WD'],
    'auto_fuel_type': ['ガソリン', 'ディーゼル', 'ハイブリッド', '電気', 'その他'],
    'auto_paint': ['ブラック', 'ブルー', 'ブラウン', 'グリーン', 'グレー', 'オレンジ', 'パープル',
                   'レッド', 'シルバー', 'ホワイト', 'イエロー', 'カスタム'],
    'auto_size': ['コンパクト', 'フルサイズ', '中型', 'サブコンパクト'],
    'auto_title_status': ['クリーン', 'サルベージ', '再生', '部品のみ', '先取特権', '不明'],
    'auto_transmission': ['MT', 'AT', 'その他'],
    'auto_bodytype': ['バス', 'コンバーチブル', 'クーペ', 'ハッチバック', 'ミニバン', 'オフロード',
                      'ピックアップ', 'セダン', 'トラック', 'SUV', 'ワゴン', 'バン', 'その他'],
    'language': ['afrikaans', 'català', 'dansk', 'deutsch', 'english', 'español', 'suomi',
                 'français', 'italiano', 'nederlands', 'norsk', 'português', 'svenska',
                 'filipino', 'türkçe', '中文', 'العربية', '日本語', '한국말', 'русский',
                 'tiếng việt']}

Using this information, we can find cars & trucks with clean (クリーン) titles in Tokyo, Japan:

import pycraigslist

tokyo_autos = pycraigslist.forsale.cta(site="tokyo", auto_title_status="クリーン")
for auto in tokyo_autos.search():
    print(auto)

>>> {'country': 'JP',
    'region': '',
    'site': 'tokyo',
    'area': '',
    'category': 'cto',
    'id': '7301105503',
    'repost_of': '',
    'last_updated': '2021-04-03 14:04',
    'title': 'Suzuki Jimny 660 XG 4WD Keyless Entry Aluminum Wheel Non-Smoking Car',
    'neighborhood': 'Chiba Ken, Noda shi, Funakata 1630-1',
    'price': '¥650,000',
    'url': 'https://tokyo.craigslist.org/cto/d/suzuki-jimny-660-xg-4wd-keyless-entry/7301105503.html'}
    # ...

When applying many filters, pass a dictionary of filters into the filters keyword parameter. Note: keyword argument filters will override filters if there are conflicting keys. For example:

import pycraigslist

bike_filters = {
    "bicycle_frame_material": "steel",
    # array of filter values are accepted
    "bicycle_wheel_size": ["650C", "700C"],
    "bicycle_type": "road",
}
# we'd still get titanium road bikes with size 650C or 700C wheels
titanium_bikes = pycraigslist.forsale.bia(
    site="sfbay", area="sfc", bicycle_frame_material="titanium", filters=bike_filters
)

Searching for posts

To search for Craigslist posts, use .search(). .search() will return a dictionary of post attributes (type str) and will search for every post by default. Use the limit keyword parameter to add a stop limit to a query. For example, use limit=50 to get 50 posts. There is a maximum of 3000 posts per query.

Find the first 20 posts for farming and gardening services in Denver, Colorado:

import pycraigslist

gardening_services = pycraigslist.services.fgs(site="denver")
for service in gardening_services.search(limit=20):
    print(service)

>>> {'country': 'US',
    'region': 'CO',
    'site': 'denver',
    'area': '',
    'category': 'fgs',
    'id': '7301324564',
    'repost_of': '6974119634',
    'last_updated': '2021-04-03 11:47',
    'title': '🌲 Tree Removal/Trimming, Stump Grind: LICENSED/INSURED! 720-605-1584',
    'neighborhood': 'All Areas',
    'price': '',
    'url': 'https://denver.craigslist.org/fgs/d/littleton-tree-removal-trimming-stump/7301324564.html'}
    # ...

Use .search_detail() to get detailed Craigslist posts. The limit keyword parameter in .search also applies to .search_detail. Set include_body=True to include the post’s body in the output. By default, include_body=False. Disclaimer: .search_detail is more time consuming than .search.

Get detailed posts with the post body for all cars & trucks for sale in Abilene, Texas:

import pycraigslist

all_autos = pycraigslist.forsale.cta(site="abilene")
for auto in all_autos.search_detail(include_body=True):
    print(auto)

>>> {'country': 'US',
    'region': 'TX',
    'site': 'abilene',
    'area': '',
    'category': 'cto',
    'id': '7309894792',
    'repost_of': '',
    'last_updated': '2021-04-20 12:17',
    'title': '2009 Mercedes GL-320',
    'neighborhood': 'Brownwood',
    'price': '$12,000',
    'url': 'https://abilene.craigslist.org/cto/d/brownwood-2009-mercedes-gl-320/7309894792.html',
    'lat': '31.729000',
    'lon': '-99.019000',
    'address': '',
    'misc': ['2009 mercedes-benz gl-class'],
    'condition': 'excellent',
    'drive': 'fwd',
    'fuel': 'diesel',
    'odometer': '100700',
    'paint_color': 'black',
    'title_status': 'clean',
    'transmission': 'automatic',
    'body': 'BEAUTIFUL car inside and out!! Diesel with only 100k, mechanic says its in great condition.'}
    # ...

Additional attributes

__doc__: Gets category name.

url: Gets full URL.

count: Gets number of posts.

import pycraigslist

east_bay_apa = pycraigslist.housing.apa(site="sfbay", area="eby", max_price=800)

1

print(east_bay_apa.doc)

‘apartments / housing for rent’

2

print(east_bay_apa.url)

https://sfbay.craigslist.org/search/eby/apa?searchNearby=1&s=0&max_price=800'

3

print(east_bay_apa.count)

56

Exceptions

pycraigslist has the following exceptions:

  • MaximumRequestsError : exceeds maximum retries for a query

To use pycraigslist exceptions, import / import from pycraigslist.exceptions. For example:

import pycraigslist
from pycraigslist.exceptions import MaximumRequestsError

try:
    sf_bikes = pycraigslist.forsale.bia(site="sfbay", area="sfc", min_price=50)
    for bike in sf_bikes.search():
        print(bike)
except MaximumRequestsError:
    print("Yikes! Something's up with the network.")

GitHub

https://github.com/irahorecka/pycraigslist

Source: https://pythonawesome.com/a-fast-and-expressive-craigslist-api-wrapper/