Using Faker to generate events

Faker

A quick introduction to Faker, a Python library for generating fake data.

Installation

In [1]:
%%sh
pip install Faker
Collecting Faker
  Using cached https://files.pythonhosted.org/packages/d4/ed/2fd5337ed405c4258dde1254e60f4e8ef9f1787576c0a2cd0d750b1716a6/Faker-2.0.3-py2.py3-none-any.whl
Requirement already satisfied: six>=1.10 in /Users/j.waterschoot/.local/share/virtualenvs/pdf-project-FE8EO07q/lib/python3.7/site-packages (from Faker) (1.12.0)
Requirement already satisfied: python-dateutil>=2.4 in /Users/j.waterschoot/.local/share/virtualenvs/pdf-project-FE8EO07q/lib/python3.7/site-packages (from Faker) (2.8.0)
Requirement already satisfied: text-unidecode==1.3 in /Users/j.waterschoot/.local/share/virtualenvs/pdf-project-FE8EO07q/lib/python3.7/site-packages (from Faker) (1.3)
Installing collected packages: Faker
Successfully installed Faker-2.0.3

Initialization

In [2]:
from faker import Faker
faker = Faker()

Implementation

Create a random integer:

In [3]:
faker.random_int(min=1, max=8, step=1)
Out[3]:
2

Run it a second time and it may return a different integer (here it happens to return 2 again):

In [4]:
faker.random_int(min=1, max=8, step=1)
Out[4]:
2

Or we can define a list of elements and let Faker pick a random one:

In [5]:
characters = ["Mario", "Luigi", "Peach", "Toad"]
faker.random_element(characters)
Out[5]:
'Luigi'
In [6]:
faker.random_element(characters)
Out[6]:
'Toad'

We can also generate a random date between a start and an end date, for example within the last month:

In [7]:
import datetime
from dateutil.relativedelta import relativedelta

date_end = datetime.datetime.now()
date_start = date_end + relativedelta(months=-1)
In [8]:
faker.date_between_dates(date_start=date_start, date_end=date_end)
Out[8]:
datetime.date(2019, 10, 27)
In [9]:
faker.date_between_dates(date_start=date_start, date_end=date_end)
Out[9]:
datetime.date(2019, 10, 12)

Event generator

Let's put this together to make an event generator that can be used to create fake data for any other project.

In [10]:
import json

CHARACTERS = ["Mario", "Luigi", "Peach", "Toad"]
DATE_END = datetime.datetime.now()
DATE_START = DATE_END + relativedelta(months=-1)
MAX_LIVES = 100
MIN_LIVES = 1
NUM_EVENTS = 2

def _generate_events():
    """Yield NUM_EVENTS fake events, each with a character, a lives count and a date."""
    for _ in range(NUM_EVENTS):
        yield {
            "character": faker.random_element(CHARACTERS),
            "lives": faker.random_int(min=MIN_LIVES, max=MAX_LIVES, step=1),
            "time": str(faker.date_between_dates(date_start=DATE_START, date_end=DATE_END)),
        }

print(json.dumps(list(_generate_events()), indent=2))
[
  {
    "character": "Mario",
    "lives": 12,
    "time": "2019-10-01"
  },
  {
    "character": "Peach",
    "lives": 78,
    "time": "2019-10-14"
  }
]
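
Because `time` is stored as a string, a consumer of these events may want to turn it back into a date. The sketch below uses only the standard library and the sample output shown above; the names `payload`, `events` and `total_lives` are illustrative, and `datetime.date.fromisoformat` assumes Python 3.7 or later.

```python
import datetime
import json

# Sample payload copied from the generator output above.
payload = """
[
  {"character": "Mario", "lives": 12, "time": "2019-10-01"},
  {"character": "Peach", "lives": 78, "time": "2019-10-14"}
]
"""

events = json.loads(payload)

# str(date) produces ISO 8601 (YYYY-MM-DD), so fromisoformat can parse it back.
for event in events:
    event["time"] = datetime.date.fromisoformat(event["time"])

total_lives = sum(event["lives"] for event in events)

print(events[0]["time"])  # 2019-10-01
print(total_lives)        # 90
```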

That's all! Please use this wisely.