
Planet Python

Last update: April 13, 2021 10:46 AM UTC

April 13, 2021


Codementor

How To Find Broken Images Using Selenium WebDriver?

Want to find broken images on your website? Here's how you can do broken image testing on your website using Selenium in Java, Python, C#, and PHP.

April 13, 2021 09:43 AM UTC


Programiz

Python Program to Safely Create a Nested Directory

In this example, you will learn to safely create a nested directory using Python.
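The usual recipe for this (a sketch of the standard-library approach, not necessarily the code from the linked example) is pathlib.Path.mkdir with parents=True and exist_ok=True:

```python
import tempfile
from pathlib import Path

def ensure_nested_dir(path):
    # parents=True creates missing intermediate directories;
    # exist_ok=True makes repeated calls a no-op instead of raising FileExistsError.
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return path

# Demonstrate inside a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_nested_dir(Path(tmp) / "outer" / "inner" / "leaf")
    assert target.is_dir()
    ensure_nested_dir(target)  # calling again is safe
```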

April 13, 2021 06:32 AM UTC


Kushal Das

Workshop on writing Python modules in Rust April 2021

I am conducting two repeat sessions of a workshop on "Writing Python modules in Rust".

The first session is on 16th April, 1500 UTC onwards, and the repeat session will be on 18th April 0900 UTC. More details can be found in this issue.

You don't have to have any prior Rust knowledge. I will be providing working code, and we will go very slowly to have a working Python module with useful functions in it.

If you are planning to attend or know anyone else who may want to join, then please point them to the issue link.

April 13, 2021 06:05 AM UTC


Ned Batchelder

Coverage.py and third-party code

I’ve made a change to coverage.py, and I could use your help testing it before it’s released to the world.

tl;dr: install this and let me know if you don’t like the results:
pip install coverage==5.6b1

What’s changed? Previously, coverage.py didn’t understand about third-party code you had installed. With no options specified, it would measure and report on that code, for example in site-packages. A common solution was to use --source=. to only measure code in the current directory tree. But many people put their virtualenv in the current directory, so third-party code installed into the virtualenv would still get reported.

Now, coverage.py understands where third-party code gets installed, and won’t measure code it finds there. This should produce more useful results with less work on your part.

This was a bit tricky because the --source option can also specify an importable name instead of a directory, and it had to still measure that code even if it was installed where third-party code goes.

As of now, there is no way to change this new behavior. Third-party code is never measured.

This is kind of a big change, and there could easily be unusual arrangements that aren’t handled properly. I would like to find out about those before an official release. Try the new version and let me know what you find out:

pip install coverage==5.6b1

In particular, I would like to know if any of the code you wanted measured wasn’t measured, or if there is code being measured that “obviously” shouldn’t be. Testing on Debian (or a derivative like Ubuntu) would be helpful; I know they have different installation schemes.

If you see a problem, write up an issue. Thanks for helping.

April 13, 2021 02:19 AM UTC


Podcast.__init__

Let The Robots Do The Work Using Robotic Process Automation with Robocorp


Summary

One of the great promises of computers is that they will make our work faster and easier, so why do we all spend so much time manually copying data from websites, or entering information into web forms, or any of the other tedious tasks that take up our time? As developers our first inclination is to "just write a script" to automate things, but how do you share that with your non-technical co-workers? In this episode Antti Karjalainen, CEO and co-founder of Robocorp, explains how Robotic Process Automation (RPA) can help us all cut down on time-wasting tasks and let the computers do what they’re supposed to. He shares how he got involved in the RPA industry, his work with Robot Framework and RPA framework, how to build and distribute bots, and how to decide if a task is worth automating. If you’re sick of spending your time on mind-numbing copy and paste then give this episode a listen and then let the robots do the work for you.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial.
  • Software is read more than it is written, so complex and poorly organized logic slows down everyone who has to work with it. Sourcery makes those problems a thing of the past, giving you automatic refactoring recommendations in your IDE or text editor while you write (I even have it working in Emacs). It isn’t just another linting tool that nags you about issues. It’s like pair programming with a senior engineer, finding and applying structural improvements to your functions so that you can write cleaner code faster. Best of all, listeners of Podcast.__init__ get 6 months of their Pro tier for free if you go to pythonpodcast.com/sourcery today and use the promo code INIT when you sign up.
  • Your host as usual is Tobias Macey and today I’m interviewing Antti Karjalainen about the RPA Framework for automating your daily tasks and his work at Robocorp to manage your robots in production

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of what Robotic Process Automation is?
  • What are some of the ways that RPA might be used?
    • What are the advantages over writing a custom library or script in Python to automate a given task?
    • How does the functionality of RPA compare to automation services like Zapier, IFTTT, etc.?
  • What are you building at Robocorp and what was your motivation for starting the business?
    • Who is your target customer and how does that inform the products that you are building?
  • Can you give an overview of the state of the ecosystem for RPA tools and products and how Robocorp and RPA framework fit within it?
    • How does the RPA Framework relate to Robot Framework?
  • What are some of the challenges that developers and end users often run into when trying to build, use, or implement an RPA system?
  • How is the RPA framework itself implemented?
    • How has the design of the project evolved since you first began working on it?
  • Can you talk through an example workflow for building a robot?
  • Once you have built a robot, what are some of the considerations for local execution or deploying it to a production environment?
  • How can you chain together multiple robots?
  • What is involved in extending the set of operations available in the framework?
  • What are the available integration points for plugging a robot written with RPA Framework into another Python project?
  • What are the dividing lines between RPA Framework and Robocorp?
    • How are you handling the governance of the open source project?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen RPA Framework and the Robocorp platform used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while building and growing RPA Framework and the Robocorp business?
  • When is RPA and RPA Framework the wrong choice for automation?
  • What do you have planned for the future of the framework and business?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

April 13, 2021 01:45 AM UTC

April 12, 2021


Paolo Amoroso

Free Python Books Went Viral on Hacker News

My Free Python Books list went viral on Hacker News, ending up on the home page within the first 2-3 entries for several hours.

Free Python Books on the home page of Hacker News.

Mike Andreuzza shared the project’s link to Hacker News on April 10, 2021. Since then the post gathered 154 upvotes. The Free Python Books GitHub repository jumped to almost 700 stars and 80 forks (up from about 95 stars and 20 forks before), reached almost 15K views from over 8K visitors, and went trending on GitHub.

This attention brought new contributions to the project as 3 authors submitted their books and another user reported a broken link. Two people even sent me donations (thanks for the coffee!).

A plot of the views (green) and unique visitors (blue) of the Free Python Books GitHub repository when the project was featured on Hacker News.

Although I had interacted online with Mike a number of times, his submission to Hacker News came out of the blue and was a complete, pleasant surprise for me.

Free Python Books is a project I began when first approaching the language. Books are my preferred learning resources, so I started maintaining a list of the many good free works I've run across.

Curation is another learning tool and the list is also a reference source for me.

My deepest thanks to Mike and the many users who appreciate the project. If you haven’t already, check out Free Python Books.

April 12, 2021 06:24 PM UTC


PythonClub - A Brazilian collaborative blog about Python

Object orientation from another perspective: classes and objects

In the few and very rare live streams I did on Twitch, the idea came up of writing about object-oriented programming in Python, mainly because of some differences in how it was implemented in this language. Taking advantage of the topic, I'm going to write a series of posts giving a different view of object orientation. And in this first post I'll talk about classes and objects.

Using a dictionary

However, before starting with object orientation, I'd like to present and discuss some examples that don't use this programming paradigm.

Thinking of a system that needs to handle data about people, it's possible to use Python dictionaries to group a person's data into a single variable, as in the example below:

pessoa = {
    'nome': 'João',
    'sobrenome': 'da Silva',
    'idade': 20,
}

The data can then be accessed through the variable and the name of the desired field, like:

print(pessoa['nome'])  # Prints João

This way, all of a person's data is grouped in a single variable, which makes programming much easier, since there's no need to create a variable for each field, and when handling data from several people it becomes much easier to identify which person a value refers to, simply by using different variables.

A function to create the dictionary

Although practical, this dictionary structure would have to be replicated every time you want to use the data of a new person. To avoid repeating code, the creation of this dictionary can be done inside a function placed in a pessoa module (a file, in this case named pessoa.py):

# File: pessoa.py

def nova(nome, sobrenome, idade):
    return {
        'nome': nome,
        'sobrenome': sobrenome,
        'idade': idade,
    }

And to create the dictionary that represents a person, just import this module (file) and call the nova function:

import pessoa

p1 = pessoa.nova('João', 'da Silva', 20)
p2 = pessoa.nova('Maria', 'dos Santos', 18)

This way, all dictionaries representing people are guaranteed to have the desired fields, properly filled in.

Functions that use the dictionary

It's also possible to create functions that perform operations on the data in these dictionaries, such as getting the person's full name, changing their surname, or having a birthday (which increases the person's age by one year):

# File: pessoa.py

def nova(nome, sobrenome, idade):
    ...  # Code abridged


def nome_completo(pessoa):
    return f"{pessoa['nome']} {pessoa['sobrenome']}"


def trocar_sobrenome(pessoa, sobrenome):
    pessoa['sobrenome'] = sobrenome


def fazer_aniversario(pessoa):
    pessoa['idade'] += 1

And used as:

import pessoa

p1 = pessoa.nova('João', 'da Silva', 20)
pessoa.trocar_sobrenome(p1, 'dos Santos')
print(pessoa.nome_completo(p1))
pessoa.fazer_aniversario(p1)
print(p1['idade'])

Here it can be seen that all the functions implemented follow the pattern of receiving the dictionary that represents the person as the first argument, possibly followed by other arguments as needed, and accessing and changing the values of that dictionary.

The object-oriented version

Before getting to the properly object-oriented version of the previous examples, I'll make a small change to ease later understanding. The nova function will be split in two parts: the first creates a dictionary and calls a second function (init), which receives that dictionary as its first argument (following the pattern of the other functions) and builds its structure with the proper values.

# File: pessoa.py

def init(pessoa, nome, sobrenome, idade):
    pessoa['nome'] = nome
    pessoa['sobrenome'] = sobrenome
    pessoa['idade'] = idade


def nova(nome, sobrenome, idade):
    pessoa = {}
    init(pessoa, nome, sobrenome, idade)
    return pessoa


...  # Remaining functions in the file

However, this doesn't change how it's used:

import pessoa

p1 = pessoa.nova('João', 'da Silva', 20)

A function to create a person

Most programming languages that support the object-oriented paradigm use classes to define the structure of objects. Python also uses classes, which can be defined with the class keyword followed by a name for the class. Inside this structure, functions can be defined to manipulate objects of that class; in some languages these are also called methods (functions declared within the scope of a class).

To convert the dictionary into a class, the first step is to implement a function that creates the desired structure. This function must be named __init__, and it is quite similar to the init function from the previous code:

class Pessoa:
    def __init__(self, nome, sobrenome, idade):
        self.nome = nome
        self.sobrenome = sobrenome
        self.idade = idade

The differences are that the first parameter is now called self, which is a Python convention, and that instead of using brackets and quotes to access the data, you just use a dot and the name of the desired field (which here can also be called an attribute, since it is a variable of the object). The nova function implemented earlier is no longer necessary: the language itself creates an object and passes it as the first argument to __init__. So to create an object of the Pessoa class, just call the class as if it were a function, ignoring the self argument and passing the rest, as if calling __init__ directly:

p1 = Pessoa('João', 'da Silva', 20)

In this case, since the class itself creates a separate context for the functions (a scope or namespace), separate files are no longer being used, though it's still possible to do so with the appropriate import. For simplicity, both the class declaration and the creation of the Pessoa object can be done in the same file, as with the remaining examples in this post.

Other functions

The other functions previously written for the dictionary can also be written in the Pessoa class, following the same differences already pointed out:

class Pessoa:
    def __init__(self, nome, sobrenome, idade):
        self.nome = nome
        self.sobrenome = sobrenome
        self.idade = idade

    def nome_completo(self):
        return f'{self.nome} {self.sobrenome}'

    def trocar_sobrenome(self, sobrenome):
        self.sobrenome = sobrenome

    def fazer_aniversario(self):
        self.idade += 1

To call these functions, just access them through the class's context, passing the previously created object as the first argument:

p1 = Pessoa('João', 'da Silva', 20)
Pessoa.trocar_sobrenome(p1, 'dos Santos')
print(Pessoa.nome_completo(p1))
Pessoa.fazer_aniversario(p1)
print(p1.idade)

This syntax is quite similar to the non-object-oriented version implemented earlier. But when working with objects, these functions can be called with another syntax: the object first, followed by a dot and the name of the desired function, with the difference that the object no longer needs to be passed as the first argument. Since the function was called through an object, Python itself takes care of passing it to the self argument, so only the remaining arguments need to be supplied:

p1.trocar_sobrenome('dos Santos')
print(p1.nome_completo())
p1.fazer_aniversario()
print(p1.idade)

There are some differences between the two syntaxes, but those will be covered later. For now, the second syntax can be seen as syntactic sugar for the first, that is, a quicker and easier way of doing the same thing, and therefore the recommended one.

Considerations

As seen in the examples, object-oriented programming is a technique for grouping variables into a single structure and easing the writing of functions that follow a certain pattern, receiving that structure as an argument; the syntax most used in Python to call an object's functions (methods) simply places the variable holding the structure before the function name instead of passing it as the first argument.

In Python, the structure or object argument (self) appears explicitly as the function's first parameter, while in other languages this variable may have another name (such as this) and not appear explicitly among the function's parameters, even though it has to be created within the function's context to allow manipulating the object.


This article was originally published on my blog; stop by there, or follow me on DEV to see more articles I've written.

April 12, 2021 06:00 PM UTC


Doug Hellmann

entry-point-inspector 0.2.1

What’s new in 0.2.1?

  • Update Python support versions to drop Python 2 support and include 3.7, 3.8, and 3.9
  • Fix packaging issue by requiring setuptools for installation (contributions by Miro Hrončok)
  • Fix linter issue in setup.py
  • Add GitHub Actions to run CI jobs

April 12, 2021 04:25 PM UTC


John Cook

Spaceship operator in Python

Some programming languages, such as Perl, have an infix operator <=> that returns a three-state comparison. The expression

    a <=> b

evaluates to -1, 0, or 1 depending on whether a < b, a = b, or a > b. You could think of <=> as a concatenation of <, =, and >.

The <=> operator is often called the “spaceship operator” because it looks like Darth Vader’s ship in Star Wars.

Python doesn’t have a spaceship operator, but you can get the same effect with numpy.sign(a-b). For example, suppose you wanted to write a program to compare two integers.

You could write

    from numpy import sign
    def compare(x, y):
        cmp = ["equal to", "greater than", "less than"][sign(x-y)]
        print(f"{x} is {cmp} {y}.")

Here we take advantage of the fact that an index of -1 points to the last element of a list.

The sign function will return an integer if its argument is an integer or a float if its argument is a float. The code above will break if you pass in floating point numbers because sign will return -1.0, 0.0, or 1.0. But if you replace sign(x-y) with int(sign(x-y)) it will work for floating point arguments.
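For a dependency-free variant, the same three-way result can be computed in pure Python with the well-known comparison idiom; this is my own sketch, not code from the post:

```python
def spaceship(a, b):
    # (a > b) - (a < b) is the classic pure-Python three-way comparison:
    # the booleans subtract as 1/0, giving -1, 0, or 1 as an int,
    # regardless of whether the arguments are ints or floats.
    return (a > b) - (a < b)

def compare(x, y):
    cmp = ["equal to", "greater than", "less than"][spaceship(x, y)]
    return f"{x} is {cmp} {y}."

print(compare(3, 5))    # 3 is less than 5.
print(compare(2.0, 2))  # 2.0 is equal to 2.
```

Because the result is always an int, the negative-index trick works for floats too, with no int() conversion needed.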

Related post: Symbol pronunciation

The post Spaceship operator in Python first appeared on John D. Cook.

April 12, 2021 03:36 PM UTC


death and gravity

Looking to improve by reading code? Some great examples from the Python standard library

So, you're an advanced beginner – you've learned your way past Python basics and can solve real problems.

You've now moved past tutorials and blog posts; maybe you feel they offer one-dimensional solutions to simple, made-up problems; maybe instead of solving this specific problem, you want to get better at solving problems in general.

Maybe you heard you should develop an eye by reading and writing a lot of code.

It's true.

So, what code should you read?


"Just read what you like."

What if you don't know what you like?

What if you don't like the right thing? Or worse, what if you like the wrong thing, and get stuck with bad habits because of it?

After all, you have to have an eye for that...

...but that's what you're trying to develop in the first place.


"There are so many projects on GitHub – pick one you like and see how they did it."

But most successful projects are quite large; where do you start from?

And even if you knew where to start, how they did it isn't always obvious. Yes, the code is right there, but it doesn't really tell you why they did it, what they didn't do, nor how they thought about the whole thing.

In other words, it is not obvious from the code itself what the design philosophy was and what choices were considered before settling on an implementation.

In this article, we'll look at some standard library modules where it is.

A note about the standard library #

As a whole, the Python standard library isn't great for learning "good" style.

While all the modules are useful, they're not very uniform:

On the other hand, the newer modules are more consistent, have detailed PEPs explaining the design tradeoffs, and some took inspiration from already mature third party libraries.

It's a few of the latter ones we'll look at.

Style aside, there's a lot to learn from the standard library, since it solves real problems for a diverse population of developers.

It's interesting and instructive to look at the differences between stdlib stuff and newer external alternatives – the newer library's existence shows a perceived deficiency in the standard library (otherwise they wouldn't have bothered with the new thing); an example of this is urllib vs. requests.

How to read these #

Roughly in this order:

dataclasses #

The dataclasses module reduces the boilerplate of writing classes by generating special methods like __init__ and __repr__. (See this tutorial for an introduction that has more concrete examples than the official documentation.)
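As a minimal illustration of the generated methods (the Point class below is an invented example):

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    x: float
    y: float
    # Mutable defaults need a factory; a bare [] would be shared between instances.
    tags: list = field(default_factory=list)

p = Point(1.0, 2.0)
print(p)                      # Point(x=1.0, y=2.0, tags=[]) -- generated __repr__
print(p == Point(1.0, 2.0))   # True -- generated __eq__ compares field by field
```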

It was introduced in PEP 557, as a simpler version of attrs. The Specification section is similar to the documentation; the good stuff is in Rationale, Discussion, and Rejected Ideas.

The code is extremely well commented; particularly interesting is this use of decision tables (ASCII version, nested if version).

It is also a good example of metaprogramming. Raymond Hettinger's Dataclasses: The code generator to end all code generators talk looks at dataclasses with a focus on the code generation aspects (HTML slides, PDF slides).

pathlib #

The pathlib module provides a simple hierarchy of classes to handle filesystem paths; it is a higher level alternative to os.path.
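A small taste of the higher-level style (the paths below are invented):

```python
from pathlib import PurePosixPath

# Pure paths do string-level manipulation without touching the filesystem,
# so the example is reproducible on any platform.
p = PurePosixPath("/var/log") / "app" / "today.log"
print(p)                     # /var/log/app/today.log
print(p.name)                # today.log
print(p.suffix)              # .log
print(p.with_suffix(".gz"))  # /var/log/app/today.gz
```

The overloaded `/` operator replaces nested os.path.join calls, and name/suffix replace fiddly os.path.splitext slicing.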

It was introduced in PEP 428. Most of the examples serve to illustrate the underlying philosophy, with the code left as specification.

The code is a good read for a few reasons:

statistics #

The statistics module adds statistical functions to the standard library; it's not intended to be a competitor to libraries like NumPy, but is rather "aimed at the level of graphing and scientific calculators".
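True to the "graphing calculator" scope, the common descriptive statistics are a single import away; a quick sketch with made-up sample data:

```python
import statistics

data = [2.5, 3.25, 5.5, 11.25, 11.75]  # arbitrary sample values

print(statistics.mean(data))    # 6.85
print(statistics.median(data))  # 5.5
print(statistics.stdev(data))   # sample standard deviation (n - 1 denominator)
```

Note that stdev is the *sample* standard deviation; pvariance/pstdev are the population variants, a distinction the module's documentation is careful about.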

It was introduced in PEP 450. Even if you are not familiar with the subject matter, it is a very interesting read:

The documentation is also very nice. This is by design; as the proposal says: "Plenty of documentation, aimed at readers who understand the basic concepts but may not know (for example) which variance they should use [...] But avoid going into tedious mathematical detail."

The code is relatively simple, and when it's not, there are comments and links to detailed explanations or papers. You may find this useful if you're just learning about this stuff and find it easier to read code than maths notation.

Bonus: graphlib #

graphlib was added in Python 3.9, and at the moment contains just one thing: an implementation of a topological sort algorithm (here's a refresher on what it is and how it's useful).
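The API that came out of that discussion is pleasantly small: describe each node's predecessors and ask for a valid order. A sketch with an invented build-step graph:

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set: 'build' needs 'deps', and so on.
graph = {"build": {"deps"}, "tests": {"build"}, "package": {"build", "tests"}}

order = list(TopologicalSorter(graph).static_order())
print(order)  # a valid order, e.g. ['deps', 'build', 'tests', 'package']
```

static_order covers the simple case; the prepare/get_ready/done methods serve the second, incremental use case discussed in the issue.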

This doesn't come with a PEP; it does however have an issue with lots of comments from various core developers, including Raymond Hettinger and Tim Peters (of Zen of Python fame).

Since this is essentially a solved problem, most of the discussion focuses on the API instead: where to put it, what to call it, how to represent the input and the output, how to make it easy to use and flexible at the same time.

One thing they're trying to do is reconcile two different use cases:

Unlike with PEPs, you can see the solution evolving as you read. Most enhancement proposals summarize the main other choices as well, but if you don't follow the mailing list links it's easy to get the impression they just appear, fully formed.

Compared to the discussion, the code itself is tiny – just under 250 lines, mostly comments and documentation.


That's it for now.

If you found this useful, please consider sharing it on Reddit or anywhere else :)

April 12, 2021 02:55 PM UTC


Real Python

Start Contributing to Python: Your First Steps

If you want to start contributing to open source, then Python is a great project to start with. You’ll not only be making your mark on one of the biggest projects out there, but you’ll also be doing it as part of a vibrant and welcoming community. Open source projects rely on contributions from volunteers like you to grow and evolve, so you’ll be making a real difference to the future of open source software.

On top of that, contributing to open source is a great way to learn and build your skills, so don’t worry if you don’t feel like an expert. There may be a way to contribute that’s perfect for you, even if you don’t know about it yet. It all starts with your first contribution!

By the end of this tutorial, you’ll know:

  • How you can contribute in a way that matches your skills and interests
  • What resources and tools you can use to help you contribute confidently
  • Where you can find ideas for fixes to propose in your first contribution

Free Download: Get a sample chapter from CPython Internals: Your Guide to the Python 3 Interpreter showing you how to unlock the inner workings of the Python language, compile the Python interpreter from source code, and participate in the development of CPython.

How You Can Contribute

Depending on your interests and skills, you can contribute in a number of different ways. For example, if you want to contribute to CPython, you can:

But if you want to contribute in other areas, you can:

You can also help review pull requests from other contributors. The core developers have a lot of work on their hands, so if you can help move some issues forward, then you’ll be helping Python get better faster.

How to Get the Resources You’ll Need

When you start contributing to an open source project, there can be a lot of information to take in all at once.

Read the full article at https://realpython.com/start-contributing-python/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

April 12, 2021 02:00 PM UTC


Cusy

Python Pattern Matching in Linux Magazine 05/2021

Linux Magazin 05/2021

Python, originally an object-oriented programming language, is to receive a new feature in version 3.10 that is mainly known from functional languages: pattern matching. The change is controversial in the Python community and has triggered a heated debate.

Pattern matching is a symbol-processing method that uses a pattern to identify discrete structures or subsets, e.g. strings, trees or graphs. This procedure is found in functional or logical programming languages where a match expression is used to process data based on its structure, e.g. in Scala, Rust and F#. A match statement takes an expression and compares it to successive patterns specified as one or more cases. This is superficially similar to a switch statement in C, Java or JavaScript, but much more powerful.

Python 3.10 is now also to receive such a match expression. The implementation is described in PEP (Python Enhancement Proposal) 634. [1] Further information on the plans can be found in PEP 635 [2] and PEP 636 [3]. How pattern matching is supposed to work in Python 3.10 is shown by this very simple example, where a value is compared with several literals:

def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"

In the last case of the match statement, an underscore _ acts as a wildcard that catches everything. This has caused irritation among developers, because in Python a leading underscore is usually used in variable names to mark them for internal use. While Python does not distinguish between private and public variables as strictly as Java does, it is still a very widely used convention, one that is also specified in the Style Guide for Python Code [4].

However, the proposed match statement can not only check patterns, i.e. detect a match between the value of a variable and a given pattern, it also rebinds the variables that match the given pattern.

This means that in Python we suddenly have to deal with Schrödinger constants, which only remain constant until we take a closer look at them in a match statement. The following example illustrates this:

NOT_FOUND = 404
retcode = 200

match retcode:
    case NOT_FOUND:
        print('not found')

print(f"Current value of {NOT_FOUND=}")

This results in the following output:

not found
Current value of NOT_FOUND=200

This behaviour has drawn harsh criticism of the proposal from experienced Python developers such as Brandon Rhodes, author of "Foundations of Python Network Programming":

If this poorly-designed feature is really added to Python, we lose a principle I’ve always taught students: “if you see an undocumented constant, you can always name it without changing the code’s meaning.” The Substitution Principle, learned in algebra? It’ll no longer apply.

— Brandon Rhodes on 12 February 2021, 2:55 pm on Twitter [5]

Many long-time Python developers, however, are not only grumbling about the structural pattern matching coming in Python 3.10. More generally, they regret developments of recent years in which more and more syntactic sugar has been sprinkled over the language. In their view, original principles, as laid down in the Zen of Python [6], are being forgotten and functional stability is being lost.

Although Python has defined a sophisticated process with the Python Enhancement Proposals (PEPs) [7] that can be used to collaboratively steer the further development of Python, there is always criticism on Twitter and other social media, as is the case now with structural pattern matching. In fact, the topic has already been discussed intensively in the Python community. The Python Steering Council [8] recommended adoption of the Proposals as early as December 2020. Nevertheless, the topic only really boiled up with the adoption of the Proposals. The reason for this is surely the size and diversity of the Python community. Most programmers are probably only interested in discussions about extensions that solve their own problems. The other developments are overlooked until the PEPs are accepted. This is probably the case with structural pattern matching. It opens up solutions to problems that were hardly possible in Python before. For example, it allows data scientists to write matching parsers and compilers for which they previously had to resort to functional or logical programming languages.

With the adoption of the PEP, the discussion has now been taken into the wider Python community. Incidentally, Brett Cannon, a member of the Python Steering Council, pointed out in an interview [9] that the last word has not yet been spoken: until the first beta version, there is still time for changes if problems arise in practically used code. He also held out the possibility of changing the meaning of _ once again.

So maybe we will be spared Schrödinger’s constants.


[1] PEP 634: Specification
[2] PEP 635: Motivation and Rationale
[3] PEP 636: Tutorial
[4] https://pep8.org/#descriptive-naming-styles
[5] @brandon_rhodes
[6] PEP 20 – The Zen of Python
[7] Index of Python Enhancement Proposals (PEPs)
[8] Python Steering Council
[9] Python Bytes Episode #221

April 12, 2021 12:57 PM UTC


Stack Abuse

Borůvka's Algorithm in Python - Theory and Implementation

Introduction

Borůvka's Algorithm is a greedy algorithm published by Otakar Borůvka, a Czech mathematician best known for his work in graph theory. Its most famous application helps us find the minimum spanning tree in a graph.

A thing worth noting about this algorithm is that it's the oldest known minimum spanning tree algorithm. Borůvka came up with it in 1926, before computers as we know them today even existed. It was published as a method of constructing an efficient electricity network.

In this guide, we'll take a refresher on graphs and minimum spanning trees, and then jump into Borůvka's algorithm and implement it in Python.

Graphs and Minimum Spanning Trees

A graph is an abstract structure that represents a group of objects called nodes (also known as vertices), in which certain pairs of those nodes are connected or related. Each of these connections is called an edge.

A tree is an example of a graph:

Graph basic example

In the image above, the first graph has 4 nodes and 4 edges, while the second graph (a binary tree) has 7 nodes and 6 edges.

Graphs can be applied to many problems, from geospatial locations to social network graphs and neural networks. Conceptually, graphs like these are all around us. For example, say we'd like to plot a family tree, or explain to someone how we met our significant other. We might introduce a large number of people and their relationships to make the story as interesting to the listener as it was to us.

Since this is really just a graph of people (nodes) and their relationships (edges) - graphs are a great way to visualize this:

visualizing relationships with graphs

Types of Graphs

Depending on the types of edges a graph has, we have two distinct categories of graphs:

An undirected graph is a graph in which the edges do not have orientations. All edges in an undirected graph are, therefore, considered bidirectional.

Formally, we can define an undirected graph as G = (V, E), where V is the set of all the graph's nodes, and E is a set that contains unordered pairs of elements from V, which represent edges.

Unordered pairs here mean that the relationship between two nodes is always two-sided: if we know there's an edge that goes from A to B, we know for sure that there's an edge that goes from B to A.

A directed graph is a graph in which the edges have orientations.

Formally, we can define a directed graph as G = (V, E), where V is the set of all the graph's nodes, and E is a set that contains ordered pairs of elements from V.

Ordered pairs imply that the relationship between two nodes can be either one or two-sided. Meaning that if there's an edge that goes from A to B, we can't know if there's an edge that goes from B to A.

The direction of an edge is denoted with an arrow. Keep in mind that two-sided relationships can be shown either by drawing two distinct arrows or just drawing two arrow points on either side of the same edge:

directed and undirected graphs
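The difference between unordered and ordered pairs can be made concrete in Python, for instance by modelling undirected edges as frozensets and directed edges as tuples (an illustrative sketch, not from the article):

```python
# Undirected: edges as frozensets, so {A, B} is the same edge as {B, A}
undirected_E = {frozenset(("A", "B")), frozenset(("B", "C"))}
print(frozenset(("B", "A")) in undirected_E)  # True

# Directed: edges as tuples, so (A, B) is not the same edge as (B, A)
directed_E = {("A", "B"), ("B", "C")}
print(("B", "A") in directed_E)  # False
```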

Another way to differentiate graphs based on their edges is regarding the weight of those edges. Based on that, a graph can be:

A weighted graph is a graph in which every edge is assigned a number - its weight. These weights can represent the distance between nodes, capacity, price et cetera, depending on the problem we're solving.

Weighted graphs are used pretty often, for example in problems where we need to find the shortest path or, as we will soon see, in problems in which we have to find a minimum spanning tree.

An unweighted graph does not have weights on its edges.

Note: In this article, we will focus on undirected, weighted graphs.

A graph can also be connected or disconnected. A graph is connected if there is a path (which consists of one or more edges) between each pair of nodes. On the other hand, a graph is disconnected if there is a pair of nodes that aren't connected by a path of edges.

connected and disconnected graphs
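Connectivity is easy to check programmatically: starting from any node, a breadth-first search must reach every other node. A small sketch of such a check (an assumed helper, not part of the article's later implementation):

```python
from collections import deque

def is_connected(nodes, edges):
    """Check whether an undirected graph is connected using BFS."""
    if not nodes:
        return True
    adjacency = {n: [] for n in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbor in adjacency[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    # connected exactly when the search visited every node
    return len(seen) == len(nodes)

print(is_connected({0, 1, 2}, [(0, 1), (1, 2)]))  # True
print(is_connected({0, 1, 2}, [(0, 1)]))          # False
```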

Trees and Minimum Spanning Trees

There's a fair bit to be said about trees, subgraphs and spanning trees, though here's a really quick and concise breakdown: a tree is a connected graph without cycles; a spanning tree is a subgraph of a graph that is a tree and contains all of that graph's nodes; and a minimum spanning tree of a weighted graph is a spanning tree whose total edge weight is the smallest possible.

Note: In case all edge weights in a graph are distinct, the minimum spanning tree of that graph is going to be unique. However, if the edge weights are not distinct, there can be multiple minimum spanning trees for only one graph.

Now that we've got graph theory covered, we can tackle the algorithm itself.

Borůvka's Algorithm

The idea behind this algorithm is pretty simple and intuitive. We mentioned before that this was a greedy algorithm.

When an algorithm is greedy, it constructs a globally "optimal" solution from smaller, locally optimal solutions to subproblems. Usually it converges on a good-enough solution, since following local optima doesn't guarantee a global optimum.

Simply put, greedy algorithms make the optimal choice (out of currently known choices) at each step of the problem, aiming to get to the overall most optimal solution when all of the smaller steps add up.

You could think of greedy algorithms as a musician who's improvising at a concert and will in every moment play what sounds the best. On the other hand, non-greedy algorithms are more like a composer, who'll think about the piece they're about to perform, and take their time to write it out as sheet music.

Now, we will break down the algorithm in a couple of steps:

  1. We initialize all nodes as individual components.
  2. We initialize the minimum spanning tree S as an empty set that'll contain the solution.
  3. While there is more than one component:
    • For each component, find the minimum-weight edge that connects it to any other component.
    • If this edge isn't in the minimum spanning tree S, we add it.
  4. Once only one component is left, we have reached the end of the tree.

This algorithm takes a connected, weighted and undirected graph as an input, and its output is the graph's corresponding minimum spanning tree.

Let's take a look at the following graph and find its minimum spanning tree using Borůvka's algorithm:

minimum spanning tree graph

At the start, every node represents an individual component. That means that we will have 9 components. Let's see what the smallest-weight edge that connects each component to any other component would be:

Component   Smallest-weight edge connecting it to another component   Weight of the edge
{0}         0 - 1                                                     4
{1}         0 - 1                                                     4
{2}         2 - 4                                                     2
{3}         3 - 5                                                     5
{4}         4 - 7                                                     1
{5}         3 - 5                                                     5
{6}         6 - 7                                                     1
{7}         4 - 7                                                     1
{8}         7 - 8                                                     3

Now, our graph is going to be in this state:

applying boruvka for minimum spanning tree

The green edges in this graph represent the edges that bind together its closest components. As we can see, now we have three components: {0, 1}, {2, 4, 6, 7, 8} and {3, 5}. We repeat the algorithm and try to find the minimum-weight edges that can bind together these components:

Component         Smallest-weight edge connecting it to another component   Weight of the edge
{0, 1}            0 - 6                                                     7
{2, 4, 6, 7, 8}   2 - 3                                                     6
{3, 5}            2 - 3                                                     6

Now, our graph is going to be in this state:

boruvka's algorithm mst

As we can see, we are left with only one component in this graph, which represents our minimum spanning tree! The weight of this tree is 29, which we got after summing all of the edges:

boruvka algorithm mst

Now, the only thing left to do is implement this algorithm in Python.

Implementation

We are going to implement a Graph class, which will be the main data structure we'll be working with. Let's start off with the constructor:

class Graph:
    def __init__(self, num_of_nodes):
        self.m_v = num_of_nodes
        self.m_edges = []
        self.m_component = {}

In this constructor, we provided the number of nodes in the graph as an argument, and we initialized three fields: m_v, the number of nodes in the graph; m_edges, a list of the graph's edges; and m_component, a dictionary that maps each node to the index of the component it belongs to.

Now, let's make a helper function that we can use to add an edge to a graph's nodes:

    def add_edge(self, u, v, weight):
        self.m_edges.append([u, v, weight])

This function is going to add an edge in the format [first, second, edge weight] to our graph.

Because we ultimately want to make a method that unifies two components, we'll first need a method that finds the component index (the root) of a given node, and then a method that propagates a new component index throughout an existing component:

    def find_component(self, u):
        if self.m_component[u] == u:
            return u
        return self.find_component(self.m_component[u])

    def set_component(self, u):
        if self.m_component[u] == u:
            return
        else:
            for k in self.m_component.keys():
                self.m_component[k] = self.find_component(k)

In find_component, we artificially treat the dictionary as a tree. We check whether we've found the root of the component (only root nodes point to themselves in the m_component dictionary). If we haven't found the root, we recursively search the current node's parent.

Note: The reason we don't assume that m_component points to the correct component is that once we start unifying components, the only entries we know for sure won't change their component index are the roots.

For example, in our graph in the example above, in the first iteration, the dictionary is going to look like this:

index value
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8

We've got 9 components, and each node is a component by itself. In the second iteration, it's going to look like this:

index value
0 0
1 0
2 2
3 3
4 2
5 3
6 7
7 4
8 7

Now, tracing back to the roots, we'll see that our new components will be: {0, 1}, {2, 4, 7, 6, 8} and {3, 5}.
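The tracing-back described above can be sketched on a plain dictionary that mirrors the table, independent of the Graph class (find_root is a hypothetical standalone helper):

```python
# component dictionary after the second iteration, as in the table above
component = {0: 0, 1: 0, 2: 2, 3: 3, 4: 2, 5: 3, 6: 7, 7: 4, 8: 7}

def find_root(u):
    # follow parent pointers until a node points to itself
    while component[u] != u:
        u = component[u]
    return u

# group every node under its root to recover the components
groups = {}
for node in component:
    groups.setdefault(find_root(node), set()).add(node)
print(groups)  # {0: {0, 1}, 2: {2, 4, 6, 7, 8}, 3: {3, 5}}
```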

The last method we're going to need before implementing the algorithm itself is the method that unifies two components into one, given two nodes which belong to their respective components:

    def union(self, component_size, u, v):
        if component_size[u] <= component_size[v]:
            self.m_component[u] = v
            component_size[v] += component_size[u]
            self.set_component(u)

        elif component_size[u] >= component_size[v]:
            self.m_component[v] = self.find_component(u)
            component_size[u] += component_size[v]
            self.set_component(v)

        print(self.m_component)

In this function, we find the roots of the components for the two given nodes (which are, at the same time, their component indexes). Then we compare the components by size and attach the smaller one to the larger one. Finally, we add the size of the smaller one to the size of the larger one, because they are now a single component.

If the components are of the same size, we just unite them in whichever order we like - in this particular example we did it by attaching the second one to the first one.

Now that we've implemented all the utility methods we need, we can finally dive into Borůvka's algorithm:

    def boruvka(self):
        component_size = []
        mst_weight = 0

        minimum_weight_edge = [-1] * self.m_v

        for node in range(self.m_v):
            self.m_component.update({node: node})
            component_size.append(1)

        num_of_components = self.m_v

        print("---------Forming MST------------")
        while num_of_components > 1:
            for i in range(len(self.m_edges)):

                u = self.m_edges[i][0]
                v = self.m_edges[i][1]
                w = self.m_edges[i][2]

                u_component = self.m_component[u]
                v_component = self.m_component[v]

                if u_component != v_component:
                    if minimum_weight_edge[u_component] == -1 or \
                            minimum_weight_edge[u_component][2] > w:
                        minimum_weight_edge[u_component] = [u, v, w]
                    if minimum_weight_edge[v_component] == -1 or \
                            minimum_weight_edge[v_component][2] > w:
                        minimum_weight_edge[v_component] = [u, v, w]

            for node in range(self.m_v):
                if minimum_weight_edge[node] != -1:
                    u = minimum_weight_edge[node][0]
                    v = minimum_weight_edge[node][1]
                    w = minimum_weight_edge[node][2]

                    u_component = self.m_component[u]
                    v_component = self.m_component[v]

                    if u_component != v_component:
                        mst_weight += w
                        self.union(component_size, u_component, v_component)
                        print("Added edge [" + str(u) + " - "
                              + str(v) + "]\n"
                              + "Added weight: " + str(w) + "\n")
                        num_of_components -= 1

            minimum_weight_edge = [-1] * self.m_v
        print("----------------------------------")
        print("The total weight of the minimal spanning tree is: " + str(mst_weight))

The first thing we did in this algorithm was initialize the additional structures we need: component_size, a list that keeps track of the size of each component; mst_weight, which accumulates the total weight of the minimum spanning tree; and minimum_weight_edge, a list that stores, for each component, the cheapest edge connecting it to another component.

Then, we go through all of the edges in the graph and find the component roots on both sides of each edge.

After that, we look for the minimum-weight edge that connects these two components using a couple of if clauses: if a component has no candidate edge yet (its entry is still -1), or the current edge is cheaper than the stored candidate, we record the current edge as that component's minimum-weight edge.

After we've found the cheapest edge for each component, we add those edges to the minimum spanning tree and decrease the number of components accordingly.

Finally, we reset the list of minimum-weight edges back to -1, so that we can do all of this again. We keep iterating as long as there is more than one component in the graph.

Let's put the graph we used in the example above as the input of our implemented algorithm:

g = Graph(9)
g.add_edge(0, 1, 4)
g.add_edge(0, 6, 7)
g.add_edge(1, 6, 11)
g.add_edge(1, 7, 20)
g.add_edge(1, 2, 9)
g.add_edge(2, 3, 6)
g.add_edge(2, 4, 2)
g.add_edge(3, 4, 10)
g.add_edge(3, 5, 5)
g.add_edge(4, 5, 15)
g.add_edge(4, 7, 1)
g.add_edge(4, 8, 5)
g.add_edge(5, 8, 12)

Chucking it in the algorithm's implementation will result in:

---------Forming MST------------
{0: 1, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}
Added edge [0 - 1]
Added weight: 4

{0: 1, 1: 1, 2: 4, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}
Added edge [2 - 4]
Added weight: 2

{0: 1, 1: 1, 2: 4, 3: 5, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}
Added edge [3 - 5]
Added weight: 5

{0: 1, 1: 1, 2: 4, 3: 5, 4: 4, 5: 5, 6: 6, 7: 4, 8: 8}
Added edge [4 - 7]
Added weight: 1

{0: 1, 1: 1, 2: 4, 3: 5, 4: 4, 5: 5, 6: 4, 7: 4, 8: 8}
Added edge [6 - 7]
Added weight: 1

{0: 1, 1: 1, 2: 4, 3: 5, 4: 4, 5: 5, 6: 4, 7: 4, 8: 4}
Added edge [7 - 8]
Added weight: 3

{0: 4, 1: 4, 2: 4, 3: 5, 4: 4, 5: 5, 6: 4, 7: 4, 8: 4}
Added edge [0 - 6]
Added weight: 7

{0: 4, 1: 4, 2: 4, 3: 4, 4: 4, 5: 4, 6: 4, 7: 4, 8: 4}
Added edge [2 - 3]
Added weight: 6

----------------------------------
The total weight of the minimal spanning tree is: 29

The time complexity of this algorithm is O(E log V), where E represents the number of edges, while V represents the number of nodes.

The space complexity of this algorithm is O(V + E), since we have to keep a couple of lists whose sizes are equal to the number of nodes, as well as keep all the edges of a graph inside of the data structure itself.

Conclusion

Even though Borůvka's algorithm is not as well known as some other minimum spanning tree algorithms like Prim's or Kruskal's minimum spanning tree algorithms, it gives us pretty much the same result - they all find the minimum spanning tree, and the time complexity is approximately the same.

One advantage that Borůvka's algorithm has compared to the alternatives is that it doesn't need to pre-sort the edges or maintain a priority queue in order to find the minimum spanning tree. Even though that doesn't improve its asymptotic complexity, since it still passes over the edges O(log V) times, it is a bit simpler to code.

April 12, 2021 12:30 PM UTC


Zato Blog

Understanding WebSocket API timeouts

Zato WebSocket channels let you accept long-running API connections and, as such, they have a few settings to fine tune their usage of timeouts. Let's discover what they are and how to use them.

WebSocket channels

Zato Dashboard WebSocket menu

Zato Dashboard WebSocket channel creation form

The four timeout settings are listed below. All of the WebSocket clients using a particular channel will use the same timeouts configuration - this means that a different channel is needed if particular clients require different settings.

Tokens

Ping messages

A note about firewalls

A great advantage of using WebSocket connections is that they are bidirectional and let one easily send messages to and from clients using the same TCP connection over a longer time.

However, particularly in relation to ping messages, it needs to be remembered that stateful firewalls in data centers may have their own requirements as to how often peers should communicate. This is especially true if the communication is over the Internet rather than in the same data center.

On the one hand, this means that the ping interval should be set to a value small enough to ensure that firewalls will not break connections in the belief that Zato has nothing more to send. Yet it should not be too small lest, with a huge number of connections, the overhead of pings become too burdensome. For instance, pinging each client once a second is almost certainly too much, and usually 20-40 seconds is a much better choice.

On the other hand, firewalls may also require the side that initiated the TCP connection (i.e. the WebSocket client) to periodically send some data to keep the connection active, otherwise they will drop the connection. This means that clients should also be configured to send ping messages, and how often they should do so may depend on what the applicable firewalls expect - with only Zato pinging the client, firewalls may not recognize that the connection is still active.

Python code

Finally, it is worth keeping in mind that all the timeouts, TTLs and pings are managed by the platform automatically, and no programming is needed for them to work.

For instance, the service below, once assigned to a WebSocket channel, will focus on the business functionality rather than on low-level management of timeouts - in other words, there is no additional code required.


# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class MyService(Service):
    def handle(self):
        self.logger.info('My request is %s', self.request.input)

Next steps

April 12, 2021 10:35 AM UTC


Mike Driscoll

PyDev of the Week: Will McGugan

This week we welcome Will McGugan (@willmcgugan) as our PyDev of the Week! Will is the author of the Rich package, which is for rich text and beautiful formatting in the terminal. If you have a moment, you should check out Will’s blog. Will is also the author of Beginning Game Development with Python and Pygame. You can see what other projects he contributes to over on GitHub.

Let’s spend some time getting to know Will better!

Can you tell us a little about yourself (hobbies, education, etc):

I grew up in a small town in North East Scotland. My career took me around the UK, including some years in Oxford and London. I’ve since returned to Scotland where I live in Edinburgh with my wife. I’m quite fortunate to have been working from home as a freelance software developer long before the pandemic started.

I’m mostly self-taught, having dropped out of University to work in video games. Although I think by the time you reach my age all developers are self-taught. In such a fast-moving industry learning on the job is a must.

My main hobby outside software development is photography—in particular, wildlife photography. I once spent a night in a Finnish forest shooting wild Eurasian bears. That was quite an experience! As soon as the world returns to normal I plan to do way more traveling and photography.

I post many of my photographs on my blog and if you prompt me I’ll talk at length about focal lengths and bokeh.

Why did you start using Python?

I discovered Python back in the early 2000s when I worked in video games. I was looking for a scripting language I could compile into a game engine to manage the game mechanics while C++ handled the heavy lifting and graphics. I considered Python, Ruby, and Lua. After some research and experimentation, I settled on not Python, but Lua.

Lua was probably the best choice for that task but I found myself turning to Python for scripts and tools. I viewed Python then as more of an upgrade to Windows batch files and not as a real programming language. Only when the scripts I was writing grew more sophisticated did I begin to appreciate the expressiveness of Python and the batteries included approach. It was a refreshing change from C++ where so much had to be written from scratch.

Fast forward a few years and I made the switch to working with Python full-time, writing a chess interface for the Internet Chess Club. Python has been the focus of my career ever since, and even though I spent years learning C++, I don’t regret the switch!

What other programming languages do you know and which is your favorite?

The main other languages I use day-to-day are JavaScript and TypeScript (if that counts as another language), often in the context of a web application with a backend written in Python.

It's been a while, but I did a lot of work with C and C++ back in the day. I also wrote a fair amount of 80x86 assembly language, at a time when hand-tuning instructions was a sane thing to do.

My favorite language is of course Python. I love the language itself and the ecosystem that has grown around it.

What projects are you working on now?

My side project is currently Rich, a library for fancy terminal rendering. I’ll talk more about that later.

My day job has me building technology for dataplicity.com, which is a remote administration tool targeted at the Raspberry Pi single-board computer (but works with any Linux). Other than the front-end, the stack is entirely running on Python.

Which Python libraries are your favorite (core or 3rd party)?

I'm a huge fan of asyncio, as a lot of the work I do requires concurrency of some sort or another. I've used Twisted and Tornado to do very similar things in the past, but asyncio with the async and await keywords has made for a much more pleasant experience. Related is aiohttp, a web framework on top of asyncio which I've used in the day job to build a highly scalable WebSocket server.

Two libraries I like right now are PyDantic and Typer. I really like the way they use typing to create objects that can be statically checked with Mypy and related tools. The authors are pioneers and I hope to see more of this approach in the future!

How did the Rich package come about?

Some time ago my side-project was a web application framework called Moya. While building the command line interface for Moya I put together a “Console” class which turned out to be the prototypal version of Rich. The Moya Console class was not terribly well thought out and hard to separate from the main project, but there were some really good ideas there and I always thought I should build a standalone version of it.

I would revisit this idea in my mind every time I struggled to read some ugly badly formatted terminal output (often implemented by myself). I wished that this Uber-console I was formulating in my head already existed fully-fledged and documented, but it wasn’t going to write itself. Sometime in late 2019, I started work on it.

The core features came together quite quickly. Incidentally, I was in Wuhan, China just a few weeks before the pandemic hit when the first rich output was generated (naturally a bold magenta underlined blinking “Hello, World!”).

The first core feature was rich text, which is where the package name was derived. I could associate styles with a range of characters in a string, much like the way you can markup text in HTML. And that marked-up string could then be further manipulated while preserving the styles. That one feature made so many others possible, like syntax highlighting and markdown rendering.

The v1.0.0 release came out in May 2020 and it really took off, way more than I was expecting. There were bugs of course and plenty of feedback. I had intended to leave it there and maybe just maintain it for a while, but there were so many good suggestions for features that I kept working on it. At the moment I’m considering more TUI (Text User Interface) features to make terminal-based applications.

What are Rich’s strengths and weaknesses?

The main strength is probably the composability of the renderables (renderable is my term for anything that generates output in the terminal). For instance, a table cell may contain a panel (or any other renderable) that may itself contain another table, carrying on ad-infinitum, or at least until you run out of characters. It’s a model that allows you to quickly create elegant formatting in the terminal more like a web page than a stream of characters. One user even wrote his CV (résumé) using Rich, and it looks great!

One weakness may be the emoji support. Everyone loves emoji in terminal output. But what many don’t realize is that terminal support for emojis is spotty. Not all terminals display emojis with the same width, so the very same output may look neat on one terminal but have broken alignment on another and, to make matters worse, there is no way for Rich to detect how emoji are rendered.

Is there anything else you’d like to say?

I post Python-related articles on my blog (https://www.willmcgugan.com) from time to time. I’m @willmcgugan on twitter.

Thanks for doing the interview, Will!

The post PyDev of the Week: Will McGugan appeared first on Mouse Vs Python.

April 12, 2021 05:05 AM UTC


Wingware

Wing Python IDE 7.2.9 - April 12, 2021

Wing 7.2.9 adds remote development for 64-bit Raspberry Pi, improves auto-closing of quotes, optimizes change tracking when large numbers of project files change at once, improves debugger data display for some value types, and makes a number of other usability improvements.

See the change log for details.

Download Wing 7.2.9 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products


What's New in Wing 7.2


Wing 7.2.9 Screen Shot

Auto-Reformatting with Black and YAPF (Wing Pro)

Wing 7.2 adds support for Black and YAPF for code reformatting, in addition to the previously available built-in autopep8 reformatting. To use Black or YAPF, they must first be installed into your Python with pip, conda, or other package manager. Reformatting options are available from the Source > Reformatting menu group, and automatic reformatting may be configured in the Editor > Auto-reformatting preferences group.

See Auto-Reformatting for details.

Improved Support for Virtualenv

Wing 7.2 improves support for virtualenv by allowing the command that activates the environment to be entered in the Python Executable in Project Properties, Launch Configurations, and when creating new projects. The New Project dialog now also includes the option to create a new virtualenv along with the new project, optionally specifying packages to install.

See Using Wing with Virtualenv for details.

Support for Anaconda Environments

Similarly, Wing 7.2 adds support for Anaconda environments, so the conda activate command can be entered when configuring the Python Executable and the New Project dialog supports using an existing Anaconda environment or creating a new one along with the project.

See Using Wing with Anaconda for details.

And More

Wing 7.2 also introduces support for Python 3.9, adds How-Tos for using Wing with AWS and PyXLL, makes it easier to debug modules with python -m, adds support for Python 3 enums, supports remote development to 64-bit Raspberry Pi, simplifies manual configuration of remote debugging, allows using a command line for the configured Python Executable, supports constraining Find Uses of imported symbols to only the current file, improves vi mode, allows folding .pyi and .pi files, improves Debug I/O process management, enhances accuracy of some types of code warnings, supports remote development without SSH tunnels, improves debugger data display for some value types, improves support for recent macOS versions, and makes a number of usability and stability improvements.

For a complete list of new features in Wing 7, see What's New in Wing 7.


Try Wing 7.2 Now!


Wing 7.2 is an exciting new step for Wingware's Python IDE product line. Find out how Wing 7.2 can turbocharge your Python development by trying it today.

Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products

See Upgrading for details on upgrading from Wing 6 and earlier, and Migrating from Older Versions for a list of compatibility notes.

April 12, 2021 01:00 AM UTC


Chris Hager

PDFx update and new version release (v1.4.1)


PDFx is a tool to extract text, links, references and metadata from PDF files and URLs. Thanks to several contributors the project received a thorough update and was brought into 2021. The new release of today is PDFx v1.4.1 🎉

PDFx works like this:

April 12, 2021 12:00 AM UTC

April 11, 2021


BreadcrumbsCollector

How to use code coverage in Python with pytest?

Basics

What is code coverage?

In the simplest words, code coverage is a measure of the exhaustiveness of a test suite. 100% code coverage would then mean that a system is fully tested.

Why bother about code coverage in Python?

Theoretically, the higher code coverage is, the fewer defects a system has. Of course, tests are not enough to catch all kinds of errors, but in this uneven battle, we need all the help we can get.

From a very mechanical perspective, the codebase is composed of individual lines. Hence, a simple formula for code coverage would be (number_of_code_lines_run_at_least_once_under_tests / total_number_of_lines) * 100%. It is only at first sight that this formula looks reasonable; in practice it is far too simplistic. For the purpose of this article, consider the following piece of code:

from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    is_pregnant: bool = False
    is_regular_blood_donor: bool = False


def determine_queue_position(patient, queue):
    # initially, we assume that a patient will just join queue
    position = len(queue)

    # there are certain groups of patients that are served without
    # having to wait in a queue
    if patient.is_pregnant or patient.is_regular_blood_donor:
        position = 0

    return position

Why focusing on just covering lines is not enough

Now, let’s assume we have a test for that:

def test_pregnancy_means_accessing_doctor_without_having_to_wait():
    queue = [Patient(age=25), Patient(age=44)]
    patient = Patient(age=28, is_pregnant=True)

    queue_position = determine_queue_position(patient, queue)

    assert queue_position == 0

This test exercises EVERY line of the determine_queue_position function. According to our initial definition, we were able to get 100% code coverage with a single test. Yet this minimal test suite can hardly be called exhaustive! For example, we haven't tested against patients who are neither pregnant nor regular blood donors, or who are regular blood donors but not pregnant, etc. Not to mention cases like a queue with one or more patients pregnant or being a regular blood donor (the latter is not covered by the implementation, so we won't be focusing on it anyway).

Types of code coverage

While the original definition of code coverage is still valid (a measure of the exhaustiveness of a test suite), it turns out there is a tricky part. Namely, how do we assess whether a test suite is actually exhaustive?

Statement coverage

We already know that a naive approach of measuring executed lines of code won't cut it. On the bright side, it is the simplest one to understand. It is formally called line or statement coverage. This one is used by default in the most complete Python code coverage library – coverage.py.

Assuming we have code in func.py and tests in test_func.py files, we can see coverage.py (+pytest-cov plugin) reports 100% code coverage:

pytest --cov func

============================ test session starts =============================
platform darwin -- Python 3.9.0, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
rootdir: /Users/spb/Projects/private/bloggo/coverr
plugins: cov-2.11.1
collected 1 item

test_func.py .                                                         [100%]

---------- coverage: platform darwin, python 3.9.0-final-0 -----------
Name      Stmts   Miss  Cover
-----------------------------
func.py      11      0   100%
-----------------------------
TOTAL        11      0   100%


============================= 1 passed in 0.04s ==============================

If statement coverage is so superficial, what are better alternatives?

Branch coverage

While code indeed is composed of lines, our execution is rarely sequential from top to bottom. This is because of if-statements (and similar mechanisms) that steer how the execution flows. When there is decision-making about whether to do one thing or another, we call it branching. Respectively, the possible code paths are called branches.

This leads us to another type of code coverage – branch coverage. It is defined as (number_of_branches_executed_at_least_once_under_tests / all_branches) * 100%. This gives us a better idea about uncovered scenarios:

pytest --cov func --cov-branch --cov-report term-missing

============================ test session starts =============================
platform darwin -- Python 3.9.0, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
rootdir: /Users/spb/Projects/private/bloggo/coverr
plugins: cov-2.11.1
collected 1 item

test_func.py .                                                         [100%]

---------- coverage: platform darwin, python 3.9.0-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
func.py      11      0      4      1    93%   17->20
-----------------------------------------------------
TOTAL        11      0      4      1    93%


============================= 1 passed in 0.05s ==============================

Branch coverage told us that we miss the case where the if-statement at line 17 evaluates to False and the next executed line is return position. Covering it is a matter of testing with a patient who is neither pregnant nor a regular blood donor:

def test_not_pregnant_teenager_not_being_blood_donor_has_to_wait_in_queue():
    queue = [Patient(age=15), Patient(age=33)]
    patient = Patient(age=13)

    queue_position = determine_queue_position(patient, queue)

    assert queue_position == 2

Running test suite again shows we are now fully covered (at least in terms of branch coverage):

pytest --cov func --cov-branch --cov-report term-missing
============================ test session starts =============================
platform darwin -- Python 3.9.0, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
rootdir: /Users/spb/Projects/private/bloggo/coverr
plugins: cov-2.11.1
collected 2 items

test_func.py ..                                                        [100%]

---------- coverage: platform darwin, python 3.9.0-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
func.py      11      0      4      0   100%
-----------------------------------------------------
TOTAL        11      0      4      0   100%


============================= 2 passed in 0.05s ==============================

How about other test scenarios? Python code coverage still has no clue we haven’t tested a regular blood donor.

Condition coverage

While branch coverage nicely catches whether we missed specific paths of execution, it's indifferent to the specific conditions. You certainly remember that, for example, or is evaluated lazily – if the expression on the left side is true, then the one on the right side is not even evaluated.

# when is_pregnant is True, then the second part won't be executed!
if patient.is_pregnant or patient.is_regular_blood_donor:
    ...

Condition coverage assumes that in order to achieve 100% code coverage, the test suite needs to check situations in which every expression is True and False. It means condition coverage will require us to test with is_pregnant being both True and False, and with is_regular_blood_donor being both True and False.

A formula for this type of coverage could be (number_of_executed_bool_states_of_operands / (number_of_all_operands * 2)) * 100%.
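To make this concrete, full condition coverage of the example above also demands a case where the right-hand operand alone makes the expression True – a patient who is a regular blood donor but not pregnant. A minimal sketch (the article's Patient and determine_queue_position are repeated so the snippet is self-contained; the test name is mine):

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    is_pregnant: bool = False
    is_regular_blood_donor: bool = False


def determine_queue_position(patient, queue):
    position = len(queue)
    if patient.is_pregnant or patient.is_regular_blood_donor:
        position = 0
    return position


def test_regular_blood_donor_skips_the_queue():
    # forces the right-hand operand of the "or" to be evaluated (and True)
    queue = [Patient(age=25), Patient(age=44)]
    patient = Patient(age=52, is_regular_blood_donor=True)

    assert determine_queue_position(patient, queue) == 0
```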

Unfortunately, there is no maintained tool in Python that will measure it for you. There was a lib called instrumental but it seems it has been abandoned for years.

On the other hand, we can resort to hypothesis (a property-based testing lib) to help us generate exhaustive use cases. This is especially helpful for black-box testing, which doesn't look into the guts of the tested function (as white-box testing does).
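hypothesis could generate such inputs automatically; as a dependency-free sketch of the same idea, itertools.product enumerates every boolean state of both operands (definitions repeated from the article for self-containment; the expected positions simply restate the function's logic):

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Patient:
    age: int
    is_pregnant: bool = False
    is_regular_blood_donor: bool = False


def determine_queue_position(patient, queue):
    position = len(queue)
    if patient.is_pregnant or patient.is_regular_blood_donor:
        position = 0
    return position


def test_every_condition_state():
    queue = [Patient(age=25), Patient(age=44)]
    # all four combinations: (False, False), (False, True), (True, False), (True, True)
    for pregnant, donor in product([False, True], repeat=2):
        patient = Patient(age=30, is_pregnant=pregnant, is_regular_blood_donor=donor)
        expected = 0 if (pregnant or donor) else len(queue)
        assert determine_queue_position(patient, queue) == expected
```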

Other types of coverage

Statement-, Branch- and Condition coverage are not all types of code coverage. If you are hungry for more, see several white papers linked at the end of the article.

Installation & configuration

I am assuming you are using pytest.

Now, if you’re new to coverage and want to get your hands dirty, you can get some coverage numbers fast: install the pytest-cov plugin (pip install pytest-cov) and run pytest with the --cov flag.

Regarding configuration, we would certainly want to enable branch coverage. We can do this (+ few other options) using e.g. setup.cfg:

[coverage:run]
branch = True
omit = src/db/env.py,src/db/versions/*  # define paths to omit

[coverage:report]
show_missing = True
skip_covered = True

Good practices

When to run pytest with code coverage enabled?

During build (Continuous Integration)

Running tests with coverage should definitely happen during builds, e.g. on Jenkins, Travis or whatever tool you use. We should set a required threshold for coverage. When it's not met (code coverage less than expected), we fail the build, e.g. pytest --cov=src/ --cov-fail-under=100 tests/. In this example, the command will fail if our coverage is lower than 100%.

Locally

Just like during Continuous Integration, you can instrument pytest to run the coverage plugin by manually appending the appropriate parameters. The other option is to configure pytest to always collect coverage when it runs by using the addopts configuration in e.g. setup.cfg:

[tool:pytest]
addopts = --cov src/

Personally, I advise against the second option. Why? Because collecting code coverage in Python is a considerable performance hit. If you (or anyone in your team) uses a Test-First approach, then the extra latency becomes an annoyance. Usually, I run small parts of the test suite when working locally in a TDD cycle and then manually run the whole test suite at the end with code coverage enabled.

How much code coverage is enough?

In theory, the higher the code coverage, the better. I think it makes no sense to set it at 80% or 90%. I think 100% is possible, with a "BUT".

The stance on code coverage that my colleague Łukasz taught me is that one should start with a 100% requirement and then exclude lines where achieving code coverage is not possible. It can be done using the # pragma: no cover comment. For example, coverage will complain about abstract base classes, which is obviously nonsense:

class ApiFactory(abc.ABC):
    @abc.abstractmethod
    def foo_api(self) -> FooApi:  # pragma: no cover
        pass

    @abc.abstractmethod
    def bar_api(self) -> BarApi:  # pragma: no cover
        pass

There is also an option to set excluded lines in the configuration of coverage.py, but it's not ideal.
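For reference, the configuration-based alternative mentioned here uses coverage.py's exclude_lines option, e.g. in setup.cfg (a sketch; note that setting exclude_lines replaces the default list, so the pragma pattern must be re-added, and a line matching a decorator excludes the whole decorated function):

```ini
[coverage:report]
exclude_lines =
    pragma: no cover
    @abc.abstractmethod
```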

Of course, the rule of 100% test coverage must be loosened in codebases where code coverage wasn’t measured before. Even then it makes sense to set the expectation high. Initially, we can also exclude parts of the code.

Is 100% code coverage an intolerable burden?

Does the pursuit of 100% code coverage mean writing tests for every function/class/module? No. This is a widely held myth. If function A uses function B, then to cover both of them, testing function A can be sufficient. That will largely depend on their implementation, but in general our code is organized hierarchically, forming levels of abstraction. Measuring code coverage is then an immense help in quickly showing us which parts we missed.

Testing each and every code block individually is unreasonable. It effectively makes code immutable and tests very fragile. We should start from higher-level tests, adding low-level ones when necessary (and code coverage will give you a great hint when you need them!). Also, be mindful of encapsulation and avoid violating it during testing.

Summary

When one wants to truly lean on their test suite, code coverage is an indispensable thing.

Although 100% code coverage may look like an unattainable goal, in my opinion, it is the only expectation that works. It really clicks when combined with TDD.

Further reading

The post How to use code coverage in Python with pytest? appeared first on Breadcrumbs Collector.

April 11, 2021 06:49 PM UTC


Quansight Labs Blog

A step towards educating with Spyder

As a community manager in the Spyder team, I have been looking for ways of involving more users in the community and making Spyder useful for a larger number of people. With this, a new idea came: Education.

For the past months, we have been wondering with the team whether Spyder could also serve as a teaching-learning platform, especially in this era where remote instruction has become necessary. We submitted a proposal to the Essential Open Source Software for Science (EOSS) program of the Chan Zuckerberg Initiative, during its third cycle, with the idea of providing a simple way inside Spyder to create and share interactive tutorials on topics relevant to scientific research. Unfortunately, we didn’t get this funding, but we didn’t let this great idea die.

We submitted a second proposal to the Python Software Foundation from which we were awarded $4000. For me, this is the perfect opportunity for us to take the first step towards using Spyder for education.

Read more… (2 min remaining to read)

April 11, 2021 02:00 PM UTC

April 10, 2021


Weekly Python StackOverflow Report

(cclxx) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2021-04-10 18:59:38 GMT


  1. Spyder 5 missing dependencies - spyder_kernels version error - [23/1]
  2. Why does pandas "None | True" return False when Python "None or True" returns True? - [9/1]
  3. __eq__() called multiple times instead of once in nested data structure - [7/1]
  4. Transform a Pandas Dataframe Column and Index to Values - [6/3]
  5. Optional arguments without a default value - [6/2]
  6. Partial of a class coroutine isn't a coroutine. Why? - [6/1]
  7. Numpy matrix multiplication but instead of multiplying it XOR's elements - [5/3]
  8. In Pytorch, is there a difference between (x<0) and x.lt(0)? - [5/2]
  9. Create column based on date conditions, but I get this error AttributeError: 'SeriesGroupBy' object has no attribute 'sub'? - [5/1]
  10. indexing rows and columns in numpy - [4/3]

April 10, 2021 07:00 PM UTC

April 09, 2021


TestDriven.io

Django vs. Flask in 2021: Which Framework to Choose

In this article, we'll look at the best use cases for Django and Flask along with what makes them unique, from an educational and development standpoint.

April 09, 2021 10:28 PM UTC


Learn PyQt

PyQt6 Book now available: Create GUI Applications with Python & Qt6 — The hands-on guide to making apps with Python

Hello! Today I have released the first PyQt6 edition of my book Create GUI Applications, with Python & Qt6.

This update follows the 4th Edition of the PyQt5 book updating all the code examples and adding additional PyQt6-specific detail. The book contains 600+ pages and 200+ complete code examples taking you from the basics of creating PyQt applications to fully functional apps.

To celebrate the milestone, the book is available this week with 20% off. As with earlier editions, readers get access to all future updates for free -- so it's a great time to snap it up! You'll also get a copy of the PyQt5, PySide6 and PySide2 editions.

PyQt6 book cover

If you bought a previous edition of the book (for PyQt5, PySide2 or PySide6) you get this update for free! Just log into your account on LearnPyQt and you'll find the book already waiting for you under "My Books & Downloads".

If you have any questions or difficulty getting hold of this update, just get in touch.

Enjoy!

See the complete PyQt5 tutorial, from first steps to complete applications with Python & Qt5.

April 09, 2021 03:00 PM UTC


Quansight Labs Blog

PyTorch TensorIterator Internals - 2021 Update

For contributors to the PyTorch codebase, one of the most commonly encountered C++ classes is TensorIterator. TensorIterator offers a standardized way to iterate over elements of a tensor, automatically parallelizing operations, while abstracting device and data type details.

In April 2020, Sameer Deshmukh wrote a blog article discussing PyTorch TensorIterator Internals. Recently, however, the interface has changed significantly. This post describes how to use the current interface as of April 2021. Much of the information from the previous article is directly copied here, but with updated API calls and some extra details.

Read more… (8 min remaining to read)

April 09, 2021 02:00 PM UTC


Real Python

The Real Python Podcast – Episode #55: Getting Started With Refactoring Your Python Code

Do you think it's time to refactor your Python code? What should you think about before starting this task? This week on the show, we have Brendan Maginnis and Nick Thapen from Sourcery. Sourcery is an automated refactoring tool that integrates into your IDE and suggests improvements to your code.



April 09, 2021 12:00 PM UTC


Python for Beginners

Convert a list containing float numbers to string in Python

There may be situations where we want to convert a list containing float numbers to a string while working in Python. In this article, we will look at different ways to convert a list of float numbers to a string in which the elements of the list appear as space-separated substrings.

Important functions for converting a list of float numbers to a string

To convert a list of floating point numbers to a string, we will need several string methods, which we will discuss first before performing the operations.

str() function

The str() function converts an integer literal, a float literal or any other given input to a string literal and returns the string literal of the input after conversion. This can be done as follows.

float_num=10.0
str_num=str(float_num)

join() method

The join() method is invoked on a separator, and an iterable object of strings is passed to the method as input. It joins each string in the iterable object with the separator and returns a new string.

The syntax for the join() method is separator.join(iterable), where separator can be any string and iterable can be a list, tuple, etc. This can be understood as follows.


str_list=["I","am","Python","String"]
print("List of strings is:")
print(str_list)
separator=" "
str_output=separator.join(str_list)
print("String output is:")
print(str_output)

Output:

List of strings is:
['I', 'am', 'Python', 'String']
String output is:
I am Python String

map() function

The map() function takes as input a function (functions are first-class objects and can be passed as parameters in Python) and an iterable, executes the function on each element of the iterable, and returns a map object which can be converted into any iterable.

The syntax for the map() function is map(function, iterable). This can be understood as follows.


float_list=[10.0,11.2,11.7,12.1]
print("List of floating point numbers is:")
print(float_list)
str_list=list(map(str,float_list))
print("List of String numbers is:")
print(str_list)

Output:

List of floating point numbers is:
[10.0, 11.2, 11.7, 12.1]
List of String numbers is:
['10.0', '11.2', '11.7', '12.1']

Now we will see how to convert a list containing floating point numbers to string using the above functions.

Convert a list containing float numbers to string using for loop

We can convert the list of float numbers to string by declaring an empty string and then performing string concatenation to add the elements of the list to the string as follows.


float_list=[10.0,11.2,11.7,12.1]
print("List of floating point numbers is:")
print(float_list)
float_string=""
for num in float_list:
    float_string=float_string+str(num)+" "
    
print("Output String is:")
print(float_string.rstrip())

Output:

List of floating point numbers is:
[10.0, 11.2, 11.7, 12.1]
Output String is:
10.0 11.2 11.7 12.1

In the above program, the output string contains an extra space at the end, which has to be removed using the rstrip() method. To avoid this additional operation, we can use the join() method instead of performing the concatenation operation by adding the strings to create the output string as follows.


float_list=[10.0,11.2,11.7,12.1]
print("List of floating point numbers is:")
print(float_list)
float_string=""
for num in float_list:
    float_string=" ".join([float_string,str(num)])
print("Output String is:")
print(float_string.lstrip())

Output

List of floating point numbers is:
[10.0, 11.2, 11.7, 12.1]
Output String is:
10.0 11.2 11.7 12.1

In the above method, an extra space is added at the left of the output string, which has to be removed using the lstrip() method. To avoid this, instead of applying the str() function to every element of the list, we can use the map() function to convert the list of float numbers to a list of strings and then concatenate them using the join() method to get the output string as follows.

float_list=[10.0,11.2,11.7,12.1]
print("List of floating point numbers is:")
print(float_list)
str_list=list(map(str,float_list))
print("List of String numbers is:")
print(str_list)
# join() concatenates the whole list at once; no loop is needed
float_string=" ".join(str_list)

print("Output String is:")
print(float_string)

Output:

List of floating point numbers is:
[10.0, 11.2, 11.7, 12.1]
List of String numbers is:
['10.0', '11.2', '11.7', '12.1']
Output String is:
10.0 11.2 11.7 12.1

Convert a list containing float numbers to string using list comprehension

Instead of a for loop, we can use list comprehension to convert the list of floating point numbers to a string, combining it with the join() method as follows.

float_list=[10.0,11.2,11.7,12.1]
print("List of floating point numbers is:")
print(float_list)
str_list=[str(i) for i in float_list]
print("List of floating string numbers is:")
print(str_list)
float_string=" ".join(str_list)
print("Output String is:")
print(float_string)

Output:

List of floating point numbers is:
[10.0, 11.2, 11.7, 12.1]
List of floating string numbers is:
['10.0', '11.2', '11.7', '12.1']
Output String is:
10.0 11.2 11.7 12.1
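All three approaches reduce to the same two steps – stringify each element, then join – so they can also be collapsed into a single expression combining the map() and join() functions shown above:

```python
float_list = [10.0, 11.2, 11.7, 12.1]

# map() converts each float to a string; join() glues them with spaces
float_string = " ".join(map(str, float_list))
print("Output String is:")
print(float_string)
```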

Conclusion

In this article, we have seen how to convert a list containing floating point numbers into a string using string functions like str() and join(), together with a for loop or list comprehension, in Python. Stay tuned for more informative articles.

The post Convert a list containing float numbers to string in Python appeared first on PythonForBeginners.com.

April 09, 2021 11:58 AM UTC