CoinMarketCal: Scraping Information for Crypto Analysis

Warning:

CoinMarketCal.com has since changed its layout, so the code below no longer works as written. It is still a useful guide and reference.

If you’ve ever been to CoinMarketCal.com, you’ll have noticed an abundance of crowdsourced information about cryptocurrency updates, roadmaps, and other changes. I wanted to use this crowdsourced roadmap data to see whether it had any impact on the coins’ prices.

Scraping the data

Unfortunately, CoinMarketCal does not offer an API for gathering the data. We can still pull the page in as raw HTML and parse it using regular expressions in Python. We’ll start with our imports and the HTML pull:

import re
import urllib.request
import pandas as pd

text = urllib.request.urlopen('http://coinmarketcal.com/').read().decode()

If we print the text, we get some messy HTML. After some manual searching, we can determine where the information we want sits in the markup. For the purposes of this project, we will scrape the coin name, coin ticker, update date, and update certainty. With this information we can run some basic tests, such as whether an update increases the coin’s price when its certainty is above a threshold.
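One way to do that manual searching is to pick a string you can see on the rendered page and print the raw characters around it. The sketch below uses a hypothetical helper, show_context, and a made-up HTML fragment rather than the live page. Using repr() makes tabs and newlines visible, which matters because the patterns used later match on that exact whitespace.

```python
def show_context(text, marker, width=40):
    """Return the raw characters around the first occurrence of marker."""
    idx = text.find(marker)
    if idx == -1:
        return None  # marker not present in the page
    start = max(0, idx - width)
    end = idx + len(marker) + width
    # repr() shows \n and \t explicitly instead of rendering them
    return repr(text[start:end])

# Made-up fragment standing in for the downloaded page
sample = '<!-- Coin -->\n\t\t<h5><strong>ExampleCoin (EXC)</strong></h5>'
print(show_context(sample, 'ExampleCoin', width=20))
```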

The actual data scraping uses regular expressions (more information on them can be found here). It looks like the following:

coins = re.findall('(?<=Coin -->\n\t\t\t\t\t\t\t\t\t<h5><strong>).+(?=</strong>)',text)
dates = re.findall('(?<=\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<h5><strong>).+(?=</strong>)', text)
certainty = re.findall('(?<=aria-valuenow=").+?(?=" role)', text)
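To see how the lookbehind (?<=...) and lookahead (?=...) keep only the text between the markers, the sketch below runs the certainty pattern and a shortened version of the coin pattern on a made-up fragment. The fragment, its indentation depth, and the coin name are assumptions for illustration, not the site's actual markup.

```python
import re

# Made-up fragment; the real page used much deeper tab indentation
sample = (
    '<!-- Coin -->\n\t<h5><strong>ExampleCoin (EXC)</strong></h5>\n'
    '<div class="progress-bar" aria-valuenow="87" role="progressbar"></div>\n'
)

# Lookbehind anchors on the comment plus exact whitespace; lookahead stops at </strong>
coins = re.findall('(?<=Coin -->\n\t<h5><strong>).+(?=</strong>)', sample)

# Non-greedy .+? stops at the first '" role' after each aria-valuenow="
certainty = re.findall('(?<=aria-valuenow=").+?(?=" role)', sample)

print(coins)      # ['ExampleCoin (EXC)']
print(certainty)  # ['87']
```

Because the lookbehind has to match the whitespace character for character, a single added or removed tab in the page source is enough to make the pattern return an empty list.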

We can put all the lists together, turn them into a DataFrame, and export it as a CSV. The code in its entirety looks like:

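import re
import urllib.request
import pandas as pd

# Download the raw HTML of the events page
text = urllib.request.urlopen('http://coinmarketcal.com/').read().decode()

# Each pattern keeps only the text between its lookbehind and lookahead markers
coins = re.findall('(?<=Coin -->\n\t\t\t\t\t\t\t\t\t<h5><strong>).+(?=</strong>)', text)
dates = re.findall('(?<=\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<h5><strong>).+(?=</strong>)', text)
certainty = re.findall('(?<=aria-valuenow=").+?(?=" role)', text)

sheet = []
# zip stops at the shortest list, so mismatched lengths cannot raise an IndexError
for coin, date, pct in zip(coins, dates, certainty):
    parts = coin.split()
    # Keep only two-part entries (name, then ticker); [1:-1] strips the
    # characters wrapped around the ticker
    if len(parts) == 2:
        sheet.append([parts[0], parts[1][1:-1], date, int(pct)])

# Name the columns so the exported file is self-describing
df = pd.DataFrame(sheet, columns=['coin', 'ticker', 'date', 'certainty'])
df.to_csv('coincal.csv', index=False)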
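Note that the two-part check in the script drops any coin whose name contains a space, since split() then returns more than two pieces. A possible alternative, assuming each entry has the form "Name (TICKER)" (which the [1:-1] slice suggests but the page may not guarantee), is to split on the final parenthesized group. parse_coin is a hypothetical helper name, and the sample entries are made up.

```python
import re

# Name, then optional whitespace, then a final "(TICKER)" group at the end
COIN_PATTERN = re.compile(r'^(?P<name>.+?)\s*\((?P<ticker>[^()]+)\)$')

def parse_coin(entry):
    """Split 'Name (TICKER)' into (name, ticker); return None if it does not fit."""
    match = COIN_PATTERN.match(entry.strip())
    if match is None:
        return None
    return match.group('name'), match.group('ticker')

print(parse_coin('ExampleCoin (EXC)'))        # ('ExampleCoin', 'EXC')
print(parse_coin('Example Coin Cash (ECC)'))  # ('Example Coin Cash', 'ECC')
print(parse_coin('No ticker here'))           # None
```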

From here we can begin to test price data against these events!
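As a starting point for the threshold test mentioned earlier, the scraped rows can be split by certainty before any price data is attached. The rows, the date format, and the cutoff of 80 below are made up for illustration, and split_by_certainty is a hypothetical helper. This only groups events; it says nothing yet about price impact.

```python
# Made-up rows in the order the script collects them:
# (coin, ticker, date, certainty)
events = [
    ('ExampleCoin', 'EXC', '2018-03-01', 92),
    ('SampleToken', 'SMP', '2018-03-04', 55),
    ('DemoChain', 'DMO', '2018-03-09', 80),
]

def split_by_certainty(rows, threshold):
    """Return (at_or_above, below) lists based on the certainty value."""
    high = [row for row in rows if row[3] >= threshold]
    low = [row for row in rows if row[3] < threshold]
    return high, low

high, low = split_by_certainty(events, threshold=80)
print([row[1] for row in high])  # ['EXC', 'DMO']
print([row[1] for row in low])   # ['SMP']
```

With the pandas DataFrame from the script, the same selection can be written as a boolean mask on the certainty column.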
