python — StealthyCoder

Supply and demand

Sun, 25 Aug 2019 21:05:35 +0000

This post touches on two subjects I have already talked about: Amara's Law and creativity thriving in structure. I will reiterate Amara's Law here though:

We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.

In the past few weeks I worked on refactoring three Python projects in the field of data processing and OSINT (Open-Source Intelligence). All of them ended up with exactly the same structure, so I thought I would write a piece on how I (and a friend of mine) reworked the code into a structure that works for three separate projects in three different domains, all of which do data processing and analysis.

The structure we made follows the concepts of Suppliers and Consumers. This is not new by any means, but there are a lot of patterns and sometimes people forget them. In this pattern a class that is a Supplier supplies data. What that data is, is arbitrary: numbers, strings or other classes. Then there are classes that are Consumers; they consume whatever data the Supplier gives them and do something with it, or not. The Supplier might get some input telling it where to get the data. The input for the Consumer is the output of the Supplier, and the output of the Consumer may in turn be the input for a new Supplier or Consumer.

There might be special types of Consumers that all do a similar action, like transforming the data in a particular way. These Consumers might collectively be called Transformers. Cue 80s cartoon music... Suppliers that get the data from a specific place might also be named as such, like a Database Supplier getting the data from a database.
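To make the roles concrete before diving into the real code, here is a minimal sketch of the pattern. The class names NumberSupplier, SquareTransformer and SumConsumer are made up for this illustration and do not appear in the projects I refactored.

```python
from typing import Iterable, Iterator


class NumberSupplier:
    """Hypothetical Supplier: hands out the numbers it was given."""

    def __init__(self, numbers: Iterable[int]):
        self.numbers = numbers

    def supply(self) -> Iterator[int]:
        yield from self.numbers


class SquareTransformer:
    """Hypothetical Transformer: a Consumer that reshapes what it consumes."""

    def __init__(self, supplier: NumberSupplier):
        self.supplier = supplier

    def transform(self) -> Iterator[int]:
        for number in self.supplier.supply():
            yield number * number


class SumConsumer:
    """Hypothetical Consumer: reduces everything it receives to one total."""

    def consume(self, values: Iterable[int]) -> int:
        return sum(values)


supplier = NumberSupplier([1, 2, 3])
total = SumConsumer().consume(SquareTransformer(supplier).transform())
print(total)
```

The output of each piece is the input of the next, which is all the pattern really asks for.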

I will present the whole code and then in sections refactor it into a nice structure. You can easily find who wrote the original code but it is not a gibe towards that person.

def main():
    planet = "mars"
    my_earth_weight_kg = 75  # 165lbs / 11.8st
    weights = calculate_weights(planet, my_earth_weight_kg)
    print(f"A person weighing {my_earth_weight_kg}kg on Earth would weigh:\n")
    print(f"  {weights['pounds']:>3.2f}lbs")
    print(f"  {weights['stones']:>2.2f}st")
    print(f"  {weights['kilograms']:>3.2f}kg")
    print(f"\non {planet.capitalize()}.")

def calculate_weights(planet, my_earth_weight_kg):
    planet_details = get_planet_details()
    G = 6.67408 * 10**-11
    Fg = G * ((planet_details[planet]["mass_kg"] * my_earth_weight_kg)
              / (planet_details[planet]["mean_radius_metres"] ** 2))
    weights = newtons_to_weights(Fg)
    return weights

def get_planet_details():
    planet_details = {}
    planet_details["mercury"] = {"mass_kg": 3.3011 * 10**23,
                                 "mean_radius_metres": 2439.7 * 1000}
    planet_details["venus"] = {"mass_kg": 4.8675 * 10**24,
                               "mean_radius_metres": 6051.8 * 1000}
    planet_details["earth"] = {"mass_kg": 5.97237 * 10**24,
                               "mean_radius_metres": 6371 * 1000}
    planet_details["mars"] = {"mass_kg": 6.4171 * 10**23,
                              "mean_radius_metres": 3389.5 * 1000}
    planet_details["jupiter"] = {"mass_kg": 1.8982 * 10**27,
                                 "mean_radius_metres": 69911 * 1000}
    planet_details["saturn"] = {"mass_kg": 5.6834 * 10**26,
                                "mean_radius_metres": 58232 * 1000}
    planet_details["uranus"] = {"mass_kg": 8.681 * 10**25,
                                "mean_radius_metres": 25362 * 1000}
    planet_details["neptune"] = {"mass_kg": 1.02413 * 10**26,
                                 "mean_radius_metres": 24622 * 1000}
    return planet_details

def newtons_to_weights(N):
    weights = {}
    weights["newtons"] = N
    weights["pounds"] = N / 4.4482216
    weights["stones"] = weights["pounds"] / 14
    weights["kilograms"] = N / 9.80665
    return weights

main()

Supplier

The function called get_planet_details() looks like a Supplier type to me. In this case it supplies a dictionary of planets with some properties. So first let us create a Planet class that holds the data about planets that we need. If you are following along and have created an empty directory, create a directory inside it called models. Within that models directory place two files: __init__.py and planet.py. Inside __init__.py place the following code:

class Model(object):
    pass

Inside planet.py place the following code:

from typing import Union
from models import Model

class Planet(Model):
    def __init__(self, 
                    name: str, 
                    mass: Union[float, str], 
                    radius: Union[float, str],
                    order_of_mass: Union[int, str],
                    order_of_radius: Union[int, str]):
        self.name = name
        self._mass = float(mass)
        self._radius = float(radius)
        self.order_of_mass = int(order_of_mass)
        self.order_of_radius = int(order_of_radius)

    @property
    def mass(self):
        return self._mass * 10 ** self.order_of_mass

    @property
    def radius(self):
        return self._radius * 10 ** self.order_of_radius

I put the order_of_mass and order_of_radius attributes there so we can give a value like 3.3011 * 10**23 as two separate values: the base value 3.3011 and the order of magnitude 23. The @property decorator tells Python the method can be accessed like an attribute, and therefore we can give back the calculated values for mass and radius. The Model is only there for now so we can do nice type hinting.
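A minimal sketch of how the property behaves, using a trimmed-down copy of the class (mass only) so it runs on its own. The values are passed as strings on purpose, because that is how they will arrive from the ini files later.

```python
import math


class Planet:
    """Trimmed-down copy of the Planet model, for illustration only."""

    def __init__(self, name, mass, order_of_mass):
        self.name = name
        self._mass = float(mass)                 # base value, e.g. 5.97237
        self.order_of_mass = int(order_of_mass)  # exponent, e.g. 24

    @property
    def mass(self) -> float:
        # Computed on access; callers write planet.mass, not planet.mass()
        return self._mass * 10 ** self.order_of_mass


earth = Planet("Earth", "5.97237", "24")
print(earth.mass)
```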

Next we need a Supplier to give back all these Planet objects. From the root, create the nested directories suppliers/planet/planets, so that you have two directories in the root: suppliers and models. Inside suppliers/planet create the file PlanetSupplier.py with the following contents:

import os
from config import Config
from models.planet import Planet
from suppliers.BaseSupplier import BaseSupplier
from typing import Generator

class PlanetSupplier(BaseSupplier):
    def __init__(self, config: Config):
        self.config = config
        self.planets = []
        location = self.config.supplier['location']
        for ini in sorted(os.listdir(location)):
            # Close each file again once it has been parsed
            with open(os.path.join(location, ini)) as fp:
                planet_ini = self.config.parse(fp)
            self.planets.append(Planet(**planet_ini.defaults()))
    
    def supply(self) -> Generator[Planet, None, None]:
        for planet in self.planets:
            yield planet

    @staticmethod
    def validate(config: Config) -> None:
        assert "location" in config.supplier, "location property is needed"

Before we get into the newly mentioned classes, a quick word about the ideas. The BaseSupplier object is there for type hinting, and the static method validate is there to make sure the configuration is valid.

So we will use .ini files to hold the data on all the planets. They are in a location (I wonder if you can already guess where) and that location comes from a Config object. The supplier parses each file in that location, creates a Planet instance from it and appends it to the list. An optional improvement here is to use glob to only read in files that have a .ini suffix.
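The glob improvement could look something like this. It is a sketch that uses pathlib's glob instead of os.listdir; the helper name list_ini_files is made up, and the demo writes a few throwaway files so it can run anywhere.

```python
import tempfile
from pathlib import Path


def list_ini_files(location: str) -> list:
    """Return the .ini files in `location`, sorted by name."""
    return sorted(Path(location).glob("*.ini"))


# Demo in a throwaway directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as location:
    for filename in ("earth.ini", "mars.ini", "notes.txt"):
        Path(location, filename).write_text("[DEFAULT]\n")
    found = [path.name for path in list_ini_files(location)]

print(found)
```

The stray notes.txt is skipped, so a README dropped into the planets folder would no longer break the supplier.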

The **planet_ini.defaults() call unpacks the dictionary returned by defaults(). So if the keys in the ini file match the parameter names of the Planet constructor, the values are automatically passed into the constructor as keyword arguments, regardless of their order in the ini file. That is pretty handy. So next let us take a look at the .ini file.

[DEFAULT]
name=Earth
mass=5.97237
order_of_mass=24
radius=6371
order_of_radius=3

DEFAULT is a special section name; its values are what defaults() returns later. You might have wondered about the extra calls to float and int in the Planet __init__ method. This is because ini files handle everything as strings, so we have to convert the values once. Naturally all these files are placed in the suppliers/planet/planets/ folder. This example is called earth.ini.
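Here is a small sketch of the unpacking trick on its own. The describe function is a made-up stand-in for the Planet constructor, and the keys in the ini text are deliberately shuffled to show that their order does not matter.

```python
import configparser

EARTH_INI = """
[DEFAULT]
radius=6371
order_of_radius=3
name=Earth
mass=5.97237
order_of_mass=24
"""


def describe(name, mass, radius, order_of_mass, order_of_radius):
    """Stand-in for the Planet constructor; every value arrives as a string."""
    return f"{name}: {mass}e{order_of_mass} kg, {radius}e{order_of_radius} m"


parser = configparser.ConfigParser()
parser.read_string(EARTH_INI)
defaults = parser.defaults()

# Keys become keyword arguments, so their order in the file is irrelevant
summary = describe(**defaults)
print(summary)
```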

Next let us take a look at the BaseSupplier. It lives in the file BaseSupplier.py inside the suppliers directory.

import abc
from models import Model
from typing import Union, Generator
from config import Config

class BaseSupplier(abc.ABC):  # inheriting from ABC makes the abstractmethod markers enforceable
    
    @staticmethod
    @abc.abstractmethod
    def validate(config: Config) -> None:
        pass

    @abc.abstractmethod
    def supply(self) -> Generator[Model, None, None]:
        pass

The BaseSupplier declares two methods, and there you can see the Model used for type hinting. The supply method should always be a generator, because that allows for lazy loading and evaluation. When dealing with large data sets this means the program will not be limited so much by storage and memory.
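To see the laziness at work, here is a minimal sketch with a made-up supply function over a pretend data set of one million items. The produced list only exists to record how many items were actually generated.

```python
from itertools import islice
from typing import Iterator

produced = []  # records which items were actually generated


def supply() -> Iterator[int]:
    """Hypothetical supplier over a 'large' data set of one million items."""
    for item in range(1_000_000):
        produced.append(item)
        yield item


# Only the three requested items are ever produced
first_three = list(islice(supply(), 3))
print(first_three, len(produced))
```

Had supply returned a list instead, all one million items would have been built before the first one could be used.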

Next let us take a look at the Config. It lives in the root of the project in the file config.py.

import configparser


class Config(object):
    def __init__(self):
        parser = configparser.ConfigParser()
        assert len(parser.read("config.ini")) != 0, "Could not read config file"
        Config._validate(parser)
        self.supplier = parser['supplier']

    
    @staticmethod
    def _validate(parser: configparser.ConfigParser) -> None:
        assert "supplier" in parser, "Supplier section must be there"
        assert "type" in parser['supplier'], "Supplier section must have a type"
    
    @staticmethod
    def parse(fp) -> configparser.ConfigParser:
        parser = configparser.ConfigParser()
        parser.read_file(fp)
        return parser

As you can see, the Config gets instantiated by reading a hard coded filename, config.ini. This could be improved by passing the name to the constructor. Next we do a simple validation check, and we also offer a static parse method so other classes can parse .ini files without having to import configparser and deal with the parsing themselves.
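The constructor improvement could be sketched like this. It is a variant, not the class used in the rest of this post: the filename parameter is my own addition, and I swapped the asserts for exceptions because asserts are skipped when Python runs with -O.

```python
import configparser
import tempfile
from pathlib import Path


class Config:
    """Variant of Config that receives the file name instead of hard coding it."""

    def __init__(self, filename: str = "config.ini"):
        parser = configparser.ConfigParser()
        # read() returns the list of files it managed to parse
        if not parser.read(filename):
            raise FileNotFoundError(f"Could not read config file: {filename}")
        if "supplier" not in parser or "type" not in parser["supplier"]:
            raise ValueError("Supplier section with a type is required")
        self.supplier = parser["supplier"]


# Demo with a throwaway config file
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "custom.ini")
    path.write_text("[supplier]\ntype=planet\nlocation=suppliers/planet/planets\n")
    config = Config(str(path))
    supplier_type = config.supplier["type"]

print(supplier_type)
```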

There is one more class I want to add and that is an AgnosticSupplier. I will show the class first. It lives in the __init__.py file inside the suppliers directory.

from config import Config
from suppliers.planet.PlanetSupplier import PlanetSupplier
from suppliers.BaseSupplier import BaseSupplier
from typing import Dict, Type

class AgnosticSupplier(object):
    def __init__(self, config: Config):
        self.config = config
        # Maps the type name from config.ini to the supplier class to use
        self.provider_map: Dict[str, Type[BaseSupplier]] = {
            "planet": PlanetSupplier
        }

    def get_supplier(self) -> BaseSupplier:
        # An unknown type raises a KeyError naming the offending value
        supplier: Type[BaseSupplier] = self.provider_map[self.config.supplier['type']]
        supplier.validate(self.config)
        return supplier(self.config)

So the AgnosticSupplier is here for your ease of use. You give it a Config object and it gives you the correct Supplier back, based on the type and a simple map, which is just a dictionary of type name to class. As you can see, there is a call to supplier.validate to make sure the config is valid. Next comes the config.ini file itself, which is placed in the root directory:
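The map lookup on its own, as a minimal sketch. The resolve helper and the error message are made up for illustration; the point is only that a dictionary from type name to class is all the machinery needed, and that an unknown type can be reported clearly.

```python
class PlanetSupplier:
    """Stand-in for the real PlanetSupplier."""


PROVIDER_MAP = {"planet": PlanetSupplier}


def resolve(supplier_type: str) -> type:
    """Look up the class registered for `supplier_type`."""
    try:
        return PROVIDER_MAP[supplier_type]
    except KeyError:
        known = ", ".join(sorted(PROVIDER_MAP))
        raise ValueError(
            f"Unknown supplier type {supplier_type!r}; known: {known}"
        ) from None


supplier_class = resolve("planet")
print(supplier_class.__name__)
```

Adding a new kind of supplier then means adding one entry to the map.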

[supplier]
type=planet
location=suppliers/planet/planets

There you see all that is needed to make sure the PlanetSupplier has all the information.

Consumer

def calculate_weights(planet, my_earth_weight_kg):
    planet_details = get_planet_details()
    G = 6.67408 * 10**-11
    Fg = G * ((planet_details[planet]["mass_kg"] * my_earth_weight_kg)
              / (planet_details[planet]["mean_radius_metres"] ** 2))
    weights = newtons_to_weights(Fg)
    return weights

def newtons_to_weights(N):
    weights = {}
    weights["newtons"] = N
    weights["pounds"] = N / 4.4482216
    weights["stones"] = weights["pounds"] / 14
    weights["kilograms"] = N / 9.80665
    return weights

Now then. We have a Supplier and a Model, so all we need now is the actual operation of calculating the weight, and it will live inside a Consumer. A special one: a Transformer. So go ahead and make a directory called transformers/planet in the root of the project. Inside transformers/planet create the file WeightTransformer.py and put the following code in:

from models.planet import Planet
from suppliers.planet.PlanetSupplier import PlanetSupplier
from typing import Generator

class WeightTransformer(object):
    class Result(object):
        def __init__(self, newtons: float, planet: Planet):
            self.newtons = newtons
            self.planet = planet
    
        @property
        def pounds(self):
            return self.newtons / 4.4482216

        @property
        def stones(self):
            return self.pounds / 14

        @property
        def kilograms(self):
            return self.newtons / 9.80665

    def __init__(self, supplier: PlanetSupplier, weight: float):
        self.supplier = supplier
        self.weight = weight
        self.G = 6.67408 * 10**-11

    def transform(self) -> Generator[Result, None, None]:
        for planet in self.supplier.supply():
            Fg = self.G * ((planet.mass * self.weight) / (planet.radius ** 2))
            
            yield self.Result(Fg, planet)

So this takes over the job of calculating the weights, and it also provides an inner Result class responsible for converting the force into the different units of weight. As you can see, another generator is used to cope with large data sets. It loops over all the planets from the Supplier, not just one. Now all we need is to run the program again. The only thing I did not take over completely was the formatted printing; I am just going to print everything. This next code can be placed inside main.py in your root directory.
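As a sanity check on the formula in transform, here is the calculation worked through for Mars on its own, using the constants from the original script. The variable names are mine.

```python
G = 6.67408e-11            # gravitational constant used in the post
MARS_MASS_KG = 6.4171e23   # from get_planet_details()
MARS_RADIUS_M = 3389.5e3
person_mass_kg = 75

# Newton's law of gravitation: F = G * M * m / r^2
force_newtons = G * MARS_MASS_KG * person_mass_kg / MARS_RADIUS_M ** 2
kilograms = force_newtons / 9.80665   # scale by standard Earth gravity

print(round(force_newtons, 1), round(kilograms, 1))
```

So 75kg on Earth comes out at somewhere between 28kg and 29kg on Mars, which is what the Result for Mars should report as well.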

from config import Config
from suppliers import AgnosticSupplier
from transformers.planet.WeightTransformer import WeightTransformer

c = Config()
supplier = AgnosticSupplier(c).get_supplier()
wt = WeightTransformer(supplier, 75)
for result in wt.transform():
    print(result.planet.name, result.kilograms)

It will print out for every planet in this case. If you just want Mars then you add an if statement.

That is it. We transformed a simple piece of code into a nice structure.

Closing thoughts

Why this structure? Is this not overkill? This seems complicated for such a simple task. Maybe you are having these questions and thoughts right now. Remember the law you read at the beginning. The value in this structure is that I can add properties to every planet ini file and change the Planet constructor, and the rest of the code will still work.

We have a single point of entry in the AgnosticSupplier, so if we change something in the PlanetSupplier, there are no worries as long as it keeps the supply method. If we want to add more calculations, for example by adding a distance-from-Earth property to the planets, then a simple CommunicationTimings class can take the PlanetSupplier and give you back how long it will take for a single message to reach that planet using radio, light or something else to communicate. Your WeightTransformer will still work too.

If in the future we discover a new planet, it is only one .ini file away from being included in the code. You know exactly where to put it and it will get picked up. Maybe in the future there won't be a need for the ini files. Someone gives you a SQLite database file with all the data. You make a DatabaseSupplier and rewrite the PlanetSupplier a bit to get the data from the DatabaseSupplier and make it into Planets.
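Such a DatabaseSupplier might look roughly like the sketch below. The table name planets and its columns are made up for the demo, which uses an in-memory database so it runs without any file.

```python
import sqlite3
from typing import Iterator, Tuple


class DatabaseSupplier:
    """Hypothetical supplier that reads planet rows from a SQLite database."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection

    def supply(self) -> Iterator[Tuple[str, float, float]]:
        # Iterating over the cursor fetches rows lazily
        query = "SELECT name, mass_kg, radius_m FROM planets ORDER BY name"
        for row in self.connection.execute(query):
            yield row


# In-memory database with a made-up schema, just for the demo
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE planets (name TEXT, mass_kg REAL, radius_m REAL)")
connection.executemany(
    "INSERT INTO planets VALUES (?, ?, ?)",
    [("Mars", 6.4171e23, 3389.5e3), ("Earth", 5.97237e24, 6371e3)],
)
names = [name for name, _, _ in DatabaseSupplier(connection).supply()]
connection.close()
print(names)
```

Because it keeps the same supply method, everything downstream stays untouched.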

#code #python

Supply chains

Sun, 25 Aug 2019 22:40:30 +0000

So in the previous post I showed how to use the pattern of Suppliers and Consumers to structure any kind of data input, output and processing. Well, for bonus points we are going to make something fancy: a Chain. A Chain is nothing more than a series of steps that starts with one action, where the result of each action is the input for the next action, and so on all the way to the end. Let us make one.

Chain

First off let us create a directory to house everything in chains/planet. Inside the chains directory place a __init__.py file with the following contents:

from typing import List, Dict, Callable

class SimpleChain(object):
    class Step(object):
        def __init__(self, func: Callable, pos_args: List, kw_args: Dict = None):
            self.func = func
            self.pos_args = pos_args
            self.kw_args = kw_args
        
        def execute(self):
            if len(self.pos_args) > 0 and self.kw_args is not None:
                return self.func(*self.pos_args, **self.kw_args)
            elif len(self.pos_args) > 0:
                return self.func(*self.pos_args)
            elif self.kw_args is not None:
                return self.func(**self.kw_args)
            else:
                return self.func()


    def __init__(self):
        self.steps: List[self.Step] = []

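    def then(self, func: Callable, pos_args: List = None, kw_args: Dict = None):
        # Avoid a shared mutable default list between calls
        self.steps.append(self.Step(func, pos_args or [], kw_args))

    def execute(self):
        r = None
        counter = 0
        for step in self.steps:
            if counter > 0:
                # Prepend the previous result without changing the stored step,
                # so the chain can be executed more than once
                step = self.Step(step.func, [r] + step.pos_args, step.kw_args)
            r = step.execute()
            counter += 1

        return r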

The SimpleChain class has a method to add steps, called then, and an execute method that executes them all. You could return the counter as well to verify all steps have been run. The inner Step class is there to manage everything better and keep things easy to comprehend. The logic is that for each step beyond the first we add the result of the previous step as the first positional argument. I chose to prepend it rather than append it because we can then use unnamed functions, called lambdas, where the previous result is always the first positional argument.
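The prepend rule is easiest to see in a stripped-down form. This sketch is not the SimpleChain class; run_chain is a made-up helper that applies the same rule to a plain list of (function, extra arguments) pairs.

```python
from typing import Any, Callable, List, Tuple


def run_chain(steps: List[Tuple[Callable, list]]) -> Any:
    """Run each (function, extra_args) step, feeding results forward."""
    result = None
    for index, (func, extra_args) in enumerate(steps):
        if index == 0:
            result = func(*extra_args)
        else:
            # Previous result is always the first positional argument
            result = func(result, *extra_args)
    return result


answer = run_chain([
    (lambda: 3, []),             # produces 3
    (lambda x, y: x + y, [4]),   # x is the previous result, y comes from the step
    (lambda x: x * 2, []),       # x is the previous result
])
print(answer)
```

In the second step the lambda takes two arguments but the step only supplies one; the other is the result of the step before it.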

Next up our PlanetWeightChain to see it in action. Create the PlanetWeightChain.py file inside the chains/planet directory with the following contents:

from chains import SimpleChain
from config import Config
from suppliers import AgnosticSupplier
from transformers.planet.WeightTransformer import WeightTransformer

class PlanetWeightChain(object):
    def __init__(self, config: Config, weight: float):
        self.chain = SimpleChain()
        self.chain.then(AgnosticSupplier, [config])
        self.chain.then(lambda x: x.get_supplier())
        self.chain.then(lambda x, y: WeightTransformer(x, y), [weight])
        self.chain.then(lambda x: x.transform())
        self.chain.then(lambda x: PlanetWeightChain.print(x))        

    def execute(self):
        self.chain.execute()

    @staticmethod
    def print(results):
        for result in results:
            print(result.kilograms, result.planet.name)

This Chain first instantiates a SimpleChain and adds steps to it that look awfully similar to the main.py of the previous post. We want to pass in functions, but we don't necessarily want to make named functions for all these steps, therefore we use lambdas. The first positional argument is always the result of the previous step. So the first step creates an instance of AgnosticSupplier, which is given to the second step. There x is the AgnosticSupplier instance, and get_supplier gives us the supplier we need for the next step, which is to create a WeightTransformer.

Remember that the print at the end does not return anything, so None would be passed along if there were more steps.

Now to make our main.py slightly different:

from config import Config
from chains.planet.PlanetWeightChain import PlanetWeightChain

chain = PlanetWeightChain(Config(), 75)
chain.execute()

That is it. We have the full chain. It automatically prints the results at the end. Naturally, if you don't want it to print the results, you could stop at the transform step and get the results back.

Closing thoughts

I really like this extra system. It seems like overkill, but it makes everything so flexible. You can insert extra steps, or take an existing Chain and append or prepend steps to it wherever necessary. With this system we can join the CommunicationTimings and WeightTransformer examples from the previous post into one chain and get the combined data back.

#code #python

Optimizing code - Part 0

Sun, 06 Oct 2019 13:43:43 +0000

This first part will deal with setting up the problem and the first initial code.

Problem

The problem to solve is to find the first x Narcissistic Numbers. What they are is easy to explain. To identify whether a number is narcissistic, follow these steps:

Separate number into individual numerals
Raise each numeral to the power that is equal to the total amount of numerals (length of the number)
Sum those results together
If that sum is equal to the original number then it is narcissistic

A quick example is 153 .

Separate numerals are 1, 5 and 3.
1 ** 3 = 1, 5 ** 3 = 125 and 3 ** 3 = 27
1 + 125 + 27 = 153
153 == 153 is true

Therefore 153 is narcissistic. By looking at this algorithm we can see that all numbers with length 1 are narcissistic, since something to the power of 1 is itself.

Naive code

The naive implementation is the following:

import sys
from datetime import datetime

def is_narcissistic(x: int) -> bool:
    str_x = str(x)
    s = sum([int(i) ** len(str_x) for i in str_x])
    return s == x

def find_narcissistic_numbers(desired: int) -> None:
    for x in range(sys.maxsize):
        if is_narcissistic(x):
            desired -= 1
        if desired == 0:
            return

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

This code turns the number we are checking into a string, and by utilizing Python iterators it iterates over each character in the string, so we have step 1 down. In the list comprehension we raise each digit to the power that is the length of the string (the amount of numerals) and we sum the results. These are steps 2 and 3. Then we compare the sum with the original number, which is step 4.

The total run time for this code on my laptop is: 0:08:09.315972

Let us look in future parts how we can optimize this code.

#code #python

Optimizing code – Part 1

Sun, 06 Oct 2019 13:57:36 +0000

The first improvement we will make is to move away from the int to string to int conversion in the previous code and use integer-only operations. We can do this by using the modulo and floor division operators.

Modulo

For non-negative whole numbers written in base 10 (decimal), in other words the natural numbers, the modulo operator % with 10 gives you the rightmost digit. The operation a mod b, or a % b, means: divide a by b and return whatever remains.

Examples:
15 % 10 = 5
124 % 10 = 4
40 % 10 = 0

Floor integer

Floor integer division of a by b means: divide a by b and round the result down to the nearest whole number.

Examples:
15 // 10 = 1
143 // 10 = 14
456 // 10 = 45

As you might see, by using modulo we get the last digit and with floor integer division we chop that digit off, so we can move through the number from right to left.
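Python can do both operations in one go with the built-in divmod. The sketch below uses it to peel the digits off a number; the function name is made up, and it is an alternative spelling of the loop, not the code I timed.

```python
def digits_right_to_left(number: int) -> list:
    """Peel digits off a non-negative integer, least significant first."""
    digits = []
    while number > 0:
        # divmod returns (number // 10, number % 10) in one call
        number, last_digit = divmod(number, 10)
        digits.append(last_digit)
    return digits


print(digits_right_to_left(153))
```

Note that the digits come out in reverse order, which does not matter here because we only sum them.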

import sys
from datetime import datetime

def is_narcissistic(x: int) -> bool:
    org = x
    digits = []
    while x > 0:
        _, x = digits.append(x % 10), x // 10

    s = sum([i ** len(digits) for i in digits])
    return s == org

def find_narcissistic_numbers(desired: int) -> None:
    for x in range(sys.maxsize):
        if is_narcissistic(x):
            desired -= 1
        if desired == 0:
            return

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

The time for this code on my laptop is: 0:07:53.349431

#code #python

Optimizing code - Part 2

Sun, 06 Oct 2019 14:00:22 +0000

A small optimization is to move the call to len out of the list comprehension, as the length of the number does not change.

import sys
from datetime import datetime

def is_narcissistic(x: int) -> bool:
    org = x
    digits = []
    while x > 0:
        _, x = digits.append(x % 10), x // 10
    power = len(digits)
    s = sum([i ** power for i in digits])
    return s == org

def find_narcissistic_numbers(desired: int) -> None:
    for x in range(sys.maxsize):
        if is_narcissistic(x):
            desired -= 1
        if desired == 0:
            return

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

The time for this code to run on my laptop is: 0:07:14.172538

#code #python

Optimizing code – Part 3

Sun, 06 Oct 2019 14:02:18 +0000

The next thing we can optimize is the fact that the result of the power calculation does not change as long as the length of the numbers stays the same. In other words, 3 ** 3 does not change for the numbers 123, 345, 543 and any other number containing a 3 in the range of 100 – 999.

So we make a lookup table and put all the powers in there in a nice way. If the position in the table is the digit itself, then it is really easy to calculate the next power: just multiply the stored value by the index it is located at. This technique is a form of memoization: you store or cache the results of expensive calculations and just look them up the next time you need them.
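A quick sketch to convince yourself that the update rule is right: after each round of multiplying by the index, the table should hold every digit raised to the current length. The variable names are mine.

```python
powers = list(range(10))  # powers[d] == d ** 1

for length in range(2, 5):
    # Same update rule as described above: one multiplication per digit
    powers = [digit * value for digit, value in enumerate(powers)]
    assert powers == [digit ** length for digit in range(10)]

print(powers)
```

Ten multiplications per change in length replace a power calculation for every digit of every number.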

import sys
from datetime import datetime

POWERS = [0,1,2,3,4,5,6,7,8,9]
LENGTH = 1

def recalculate_powers() -> None:
    global POWERS
    for i,v in enumerate(POWERS):
        POWERS[i] = i * v

def is_narcissistic(x: int) -> bool:
    global LENGTH, POWERS
    org = x
    digits = []
    total = 0
    while x > 0:
        _, x = digits.append(x % 10), x // 10
    cur_length = len(digits)
    if cur_length != LENGTH:
        LENGTH = cur_length
        recalculate_powers()
    for i in digits:
        total += POWERS[i]
    return total == org

def find_narcissistic_numbers(desired: int) -> None:
    for x in range(1,sys.maxsize):
        if is_narcissistic(x):
            desired -= 1
        if desired == 0:
            return

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

The time for this code to run on my laptop: 0:06:23.868587

#code #python

Optimizing code – Part 4

Sun, 06 Oct 2019 14:10:18 +0000

The next small optimization is to bail out earlier: when we overshoot the original number while summing, we can stop immediately. For example 9 ** 6 is a big number, so with bigger numbers it makes sense to bail out early.

import sys
from datetime import datetime

POWERS = [0,1,2,3,4,5,6,7,8,9]
LENGTH = 1

def recalculate_powers() -> None:
    global POWERS
    for i,v in enumerate(POWERS):
        POWERS[i] = i * v

def is_narcissistic(x: int) -> bool:
    global LENGTH, POWERS
    org = x
    digits = []
    total = 0
    while x > 0:
        _, x = digits.append(x % 10), x // 10
    cur_length = len(digits)
    if cur_length != LENGTH:
        LENGTH = cur_length
        recalculate_powers()
    for i in digits:
        total += POWERS[i]
        if total > org:
            return False
    return total == org

def find_narcissistic_numbers(desired: int) -> None:
    for x in range(1,sys.maxsize):
        if is_narcissistic(x):
            desired -= 1
        if desired == 0:
            return

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

The time for this code to run on my laptop is: 0:06:21.202479

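To compare: the original run of the naive implementation took 0:08:09.315972. Even on my slow hardware, the optimizations cut the run time by roughly 22%, which makes the code about 1.28 times as fast. Keep in mind that from Part 3 onward the search starts at 1 instead of 0, so 0 is no longer counted as a hit.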
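The comparison can be checked with a little arithmetic on the two timings, using timedelta so the values can be copied over as they were printed. The variable names are mine.

```python
from datetime import timedelta

naive = timedelta(minutes=8, seconds=9, microseconds=315972)
optimized = timedelta(minutes=6, seconds=21, microseconds=202479)

time_saved = 1 - optimized / naive   # fraction of the run time removed
speedup = naive / optimized          # how many times faster

print(f"{time_saved:.1%} less time, {speedup:.2f}x faster")
```

The two ways of expressing the gain give different percentages, so it pays to say which one you mean.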

#code #python

Optimizing code - Part 5

Sun, 06 Oct 2019 14:21:21 +0000

Those of you who are paying attention might have already figured out that the result of the calculation for 153 is the same as for 135, 315, 351, 513 and 531. This means that we can calculate the result once for all of those and just check whether the digits of the result are the same as the digits we started with, which is the case for the number 153.

On-Line Encyclopedia of Integer Sequences

There exists an online database of integer sequences, the OEIS. It has a lot of cool sequences and one of them is the one with all Narcissistic Numbers. It has number A005188 and you can find it here.

This next code is inspired by and based on the code in the OEIS.

Code

from itertools import combinations_with_replacement
from datetime import datetime
import sys

POWERS = [0,1,2,3,4,5,6,7,8,9]
NARCISSISTIC_NUMBERS = []

def recalculate_powers():
    for i,v in enumerate(POWERS):
        POWERS[i] = i * v

def equals(number: int, target: tuple) -> bool:
    digits = []
    while number > 0:
        _, number = digits.append(number % 10), number // 10
    return tuple(sorted(digits)) == target



def find_narcissistic_numbers(desired: int) -> None:
    global POWERS, NARCISSISTIC_NUMBERS
    for k in range(1, sys.maxsize):

        for b in combinations_with_replacement(range(10), k):

            x = sum(map(lambda y:POWERS[y], b))
            if x > 0 and equals(x, b):
                NARCISSISTIC_NUMBERS.append(x)
                if len(NARCISSISTIC_NUMBERS) == desired:
                    return
        recalculate_powers()

start = datetime.utcnow()
find_narcissistic_numbers(28)
print(datetime.utcnow() - start)

combinations_with_replacement takes the digits 0-9 and the number of digits to pick each time. So for example when k = 2 you get:

[0,0] [0,1] ... [1,1] ... [2,2] ... [9,9]

As you can see, you lose one more combination every time you go up by ten, since 0,1 is equal to 1,0 and 1,2 is equal to 2,1. That leaves considerably fewer candidates to check.
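To put a number on "considerably fewer", this sketch counts the candidates for k = 2 and k = 3 and compares them with the standard multiset formula C(10 + k - 1, k). It needs Python 3.8 or newer for math.comb.

```python
from itertools import combinations_with_replacement
from math import comb

for k in (2, 3):
    candidates = list(combinations_with_replacement(range(10), k))
    # Multiset count: C(10 + k - 1, k)
    assert len(candidates) == comb(10 + k - 1, k)
    print(k, len(candidates), "combinations versus", 10 ** k, "digit strings")

two_digit = list(combinations_with_replacement(range(10), 2))
```

The gap keeps widening as k grows, which is where most of the speedup comes from.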

So this code takes this much time to run on my laptop: 0:00:00.204806

That is an insane speedup in time compared to the naive implementation and the optimized implementation.

Caveat

A small thing I found out whilst running this code: I thought it was enough to do sorted((2,1)), but this returns a list! So I needed to either wrap that in a tuple call or wrap the other side in a list call, and I did not want to do that. Therefore I went with the initial list and turn that into a tuple.
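The caveat in a few lines, in case you want to see it for yourself:

```python
digits = (2, 1)

as_list = sorted(digits)          # sorted() always builds a new list
as_tuple = tuple(sorted(digits))  # convert back when a tuple is needed

# A list never compares equal to a tuple, even with the same items
print(as_list == (1, 2), as_tuple == (1, 2))
```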

#code #python

How to transport a snake?

Mon, 24 Feb 2020 10:28:33 +0000

So we all know planes are not a good idea. Before you know it you have Samuel L. Jackson shouting in your ear.

So maybe we need a few new ideas.

Point of view

As most quasi intellectuals who want to sound philosophical like to say: it all depends on your point of view. In the case I want to make for how to ship Python applications, I want you to put yourself in the shoes of the different individuals who take part in the process of shipping it, or deploying it if you want to be technical. One is the developer who writes the source code, the other is the DevOps engineer tasked with deploying it wherever it is needed.

Ecosystem

The ecosystem for Python applications is great if you are the developer who just has to write the source code to make it work. You activate a virtual environment using whichever tool you prefer. You install the libraries you need from outside the standard library. You write the code until it all works. Job done.

The ecosystem for deploying it sucks. You can use the same flow as the developer, but that does not fit the context and environment of the DevOps engineer. You would really like to use the packages from the package manager of the operating system you are deploying to. You cannot, however, because the versions do not line up. So you have to choose.

Either Or

Either you use the developer tools and flows for deploying and for your Continuous Integration process, or you use the system packages to pre-build Docker images in which you run your code and development process. It is either or. There is a big discrepancy between the developer context and the deployment context or, if you will, the end user.

Let us assume you made something and you want other people to run it too. How do you get your Python application to the end user? You might get away with making a self-running archive. One thing still needs to be there though, and that is the C libraries on the host machine. So you cannot get away without prepping the host machine environment.
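One way to build such an archive is the zipapp module from the standard library. That choice is my assumption for this sketch; other tools exist and I am not claiming this is the one to use. The demo packs a one-line application and runs it with the current interpreter.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "app")
    source.mkdir()
    # __main__.py is the entry point that runs when the archive is executed
    (source / "__main__.py").write_text('print("hello from the archive")\n')

    archive = Path(tmp, "app.pyz")
    zipapp.create_archive(source, target=archive)

    # The archive still needs a Python interpreter on the host to run
    completed = subprocess.run(
        [sys.executable, str(archive)], capture_output=True, text=True, check=True
    )
    output = completed.stdout.strip()

print(output)
```

Note that the interpreter, and any C libraries the code depends on, still have to come from the host, which is exactly the catch.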

The one solution is to statically compile everything instead of dynamically linking it. This means a lot of management done by developers or DevOps engineers themselves. It would solve the problem, though, in that you can ship your archive and anyone else can just run it. It is not a feasible thing to accomplish, as it introduces a lot of unneeded complexity.

Two layers

There also exists the possibility of having the self-running archive and then also putting it inside a Docker image. These two layers can have two independently running pipelines: one that produces the archive and one that produces the image. Now you can update the runtime without changing the source code and without running the risk of introducing anomalies. You take the archive out, update the Python runtime, put the archive back and run it.

It gives you more fine grained control if that is needed.

#devops #python

Decoration patterns

Tue, 15 Sep 2020 20:48:59 +0000

So this post touches on some Python concepts in async/await territory. I will not cover event loops nor how to interact with them, but something I uncovered/unearthed in a quest to make something work that was synchronous only. I will use the word async which itself is a shortening of the word asynchronous.

First things first: the async structure in essence means a form of cooperative multitasking. This is in contrast with preemptive multitasking. In the latter, the scheduler hands out allotted slices of CPU time. A function that needs 3 cycles might, for example, get 2 at a time, so that over the course of all the work that needs to be done it can actually take 6 cycles to finish.

In the cooperative model the function gets to take as many cycles as it needs, as far as the scheduler is concerned, and then hands back control after 3. Hopefully that means the cycles are used in a more direct way, making the program smoother and more efficient.

As a side note, async should only be used when you expect to be I/O bound, and it should be used consistently throughout the application and its third-party libraries.

This also means that whatever function you made async does not simply hand you its result when you call it. Therefore the keyword await was introduced, stating that you should await the function until it is done.
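To make the cooperative part a bit more tangible, here is a small sketch. It does lean on asyncio.run and asyncio.gather to drive things, which goes slightly beyond the scope of this post, and the names worker and order are invented for the example. Each worker only gives up control at its await, so the two interleave at exactly those points.

```python
import asyncio

order = []


async def worker(name, steps):
    for step in range(steps):
        order.append(f"{name}{step}")
        # The only place where this worker hands control back.
        await asyncio.sleep(0)


async def main():
    # Run both workers on the same event loop, in one thread.
    await asyncio.gather(worker("a", 2), worker("b", 2))


asyncio.run(main())
print(order)  # ['a0', 'b0', 'a1', 'b1']
```

Nothing interrupts a worker between its await points. If one of them did heavy CPU work without awaiting, the other would simply have to wait.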

Getting started

First something simple.


def f():
    pass

A simple function definition. If we wanted an async function we would do the following:


async def a():
    pass

So far the only difference, other than the name, is the keyword async. So what happens when you print these defined symbols?

print(f) # <function f at 0x...>
print(a) # <function a at 0x...>

So nothing apparently. They are both just function definitions. What happens when you print the result of executing the functions?

print(f()) # None
print(a()) # <coroutine object a at 0x...>

You will also get a RuntimeWarning about a coroutine never being awaited. We leave that for what it is right now. Interestingly, both are plain functions before execution, but after execution one gives a result and the other a coroutine object. So we want to see if there is a way to determine beforehand whether the function is a coroutine.
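For what it is worth, the warning appears because calling an async function only creates the coroutine object; the body has not run yet. A minimal sketch of that, assuming a variant of a that records a call and returns 42 (both made up for the example), driven with asyncio.run:

```python
import asyncio

calls = []


async def a():
    calls.append("ran")
    return 42


coro = a()                  # only creates the coroutine object
ran_before = list(calls)    # still empty, the body has not executed

result = asyncio.run(coro)  # drives the coroutine to completion
print(ran_before, calls, result)  # [] ['ran'] 42
```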

Let us write something that does that:

import inspect

print(inspect.iscoroutine(a)) # False
print(inspect.isawaitable(a)) # False

print(inspect.iscoroutine(a())) # True
print(inspect.isawaitable(a())) # True

Hang on, both state False when operating on the definition. However, our RuntimeWarning said we needed to await. When we execute the function we do get the information, but that is still after the fact.

We first have to execute a function in order to find out we need to await. Some people out there might be thinking right now: well, you used the wrong method. There will be two solutions at the end.

Decorators

In Python there exists the following syntactic sugar:


def decorator(func):
    def wrapper():
        func()
    return wrapper

@decorator
async def a():
    pass

a()

This is the same as doing a = decorator(a). Calling a() will then actually execute the wrapper, as a is now equal to wrapper.

The first problem we run into in this example is that we gave an async function to a non-async function. You can only await in async functions. That is easily solved:
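The equivalence can be checked with a couple of plain functions. This is only a sketch: plain, sugared and manual are names invented for the example, and the wrapper returns the result so there is something to compare.

```python
def decorator(func):
    def wrapper():
        return func()
    return wrapper


def plain():
    return "hi"


# Decorating by hand...
manual = decorator(plain)


# ...and with the @ syntax.
@decorator
def sugared():
    return "hi"


# Both names now point at a wrapper function.
print(manual.__name__, sugared.__name__)  # wrapper wrapper
print(manual() == sugared())              # True
```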


def decorator(func):
    async def wrapper():
        await func()
    return wrapper

Now the wrapper is also async and we await the function inside. However, this decorator might be used for async and non-async functions alike. We still need to determine accurately beforehand whether or not a function is async.
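To see why the async-only wrapper is not enough, here is a sketch of what happens when it is applied to a regular function. The plain function returns None, and None cannot be awaited. The use of asyncio.run and the variable outcome are only there to make the example runnable.

```python
import asyncio


def decorator(func):
    async def wrapper():
        await func()
    return wrapper


@decorator
def f():
    pass


try:
    asyncio.run(f())
    outcome = "ok"
except TypeError as error:
    # f() returned None, which is not awaitable.
    outcome = f"TypeError: {error}"

print(outcome)
```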

Internals

Looking at the internals of Python, there exists the __code__ attribute on functions. Inside that attribute there is a co_flags attribute, which holds a bit field of the flags set on the function. You can get at this information in the following way:


from dis import pretty_flags

def f():
    pass

async def a():
    pass

print(pretty_flags(f.__code__.co_flags)) # OPTIMIZED, NEWLOCALS, NOFREE
print(pretty_flags(a.__code__.co_flags)) # OPTIMIZED, NEWLOCALS, NOFREE, COROUTINE

Aha, we see now that we can determine if a function is a coroutine or not. This means we can make our decorator correctly now:


from dis import pretty_flags


def decorator(func):
    def wrapper():
        func()
    async def async_wrapper():
        await func()
    
    if "COROUTINE" in pretty_flags(func.__code__.co_flags):
        return async_wrapper
    
    return wrapper
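Since co_flags is a bit field, the same check can also be done without going through the string that pretty_flags builds. A small sketch, assuming the CO_COROUTINE constant from the inspect module; the helper name is_async_def is made up:

```python
import inspect


def f():
    pass


async def a():
    pass


def is_async_def(func):
    # Test the coroutine bit directly instead of searching a string.
    return bool(func.__code__.co_flags & inspect.CO_COROUTINE)


print(is_async_def(f))  # False
print(is_async_def(a))  # True
```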

Final Solution

I already mentioned there are two solutions. I first wanted to get this internal solution out of the way, as that is the one I used first. Then I ran into something that made more sense. So the aforementioned decorator can also be written like this:


import inspect


def decorator(func):
    def wrapper():
        func()
    async def async_wrapper():
        await func()
    
    if inspect.iscoroutinefunction(func):
        return async_wrapper
    
    return wrapper

The importance of using the correct method is abundantly clear in this case. Our definition is a coroutine function, not yet a coroutine, and therefore you cannot await a definition, only the coroutine object you get from calling the async function.

Hope this helps out a bit in the future of your async Python adventure.

As a final final example the decorator should probably look like this:


import inspect


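def decorator(func):
    # Wrapper for regular functions: pass everything through and
    # hand back the result.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Wrapper for coroutine functions: the same, but awaited.
    async def async_wrapper(*args, **kwargs):
        return await func(*args, **kwargs)

    if inspect.iscoroutinefunction(func):
        return async_wrapper

    return wrapper

This propagates any and all arguments given to the function you are decorating, and hands its return value back to the caller.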
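A quick way to try the pattern out is sketched below. It restates the decorator in a form where the wrappers return the result of the wrapped function and functools.wraps copies over its name and docstring; treat both as suggestions. The functions add and async_add are hypothetical, and asyncio.run is only used to drive the async one.

```python
import asyncio
import functools
import inspect


def decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    @functools.wraps(func)
    async def async_wrapper(*args, **kwargs):
        return await func(*args, **kwargs)

    if inspect.iscoroutinefunction(func):
        return async_wrapper

    return wrapper


@decorator
def add(x, y):
    return x + y


@decorator
async def async_add(x, y):
    return x + y


sync_result = add(1, 2)
async_result = asyncio.run(async_add(1, 2))
print(sync_result, async_result)         # 3 3
print(add.__name__, async_add.__name__)  # add async_add
```

The decorated async function is still a coroutine function, so code further down the line that checks with inspect.iscoroutinefunction keeps working.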

#code #python