Planet Python

Last update: April 17, 2022 01:40 AM UTC

April 17, 2022


The Python Coding Blog

Debugging Python Code Is Like Detective Work — Let’s Investigate

Debugging Python code is not a mysterious art form. It’s like a detective solving a mystery. This analogy comes from one of my favourite programming aphorisms: “Debugging is like being the detective in a crime movie where you are also the murderer” (Felipe Fortes).

So what can real detectives tell us about debugging Python code? I thought of looking up some guidelines that police use when investigating a crime. Here are the areas detectives work on when investigating a crime scene according to the College of Policing in the UK:

[Source: https://www.app.college.police.uk/app-content/investigations/forensics/]

Let’s look at all of these and find their counterparts in debugging Python code.

I’ll use the code below as an example throughout this article. This code has a list of dictionaries with books about detectives and crimes, of course! Each item includes the author, title, year published, and the book’s rating on Goodreads:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    for book in books_list:
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(doyle_books_above_4)

There are two functions, too. One finds the books written by a specific author, and the other filters books based on their rating. The two calls at the end should result in all Arthur Conan Doyle books with a rating higher than 4. However, as you’ll see soon, there’s a problem.

Let’s start going through the areas listed in the College of Policing document.

Prove That A Crime Has Been Committed

You need to determine whether there’s something that doesn’t work in your program. Sometimes, this is obvious. Either an error is raised when you run your code, or the output from your code is clearly wrong.

But often, the bug in your code is not obvious.

You need to be on the lookout for potential crimes in the same way that police forces are on the lookout (or should be) for crimes.

This is why testing your code is crucial. Now, there are different ways of testing your code, depending on the scale and extent of the code and what its purpose is. However, whatever the code, you always need to test it somehow.

This testing will allow you to determine that a crime has been committed—there’s a bug somewhere!

The output of the code I showed you above is the following:

[]

In this case, it’s not too difficult to determine that there is indeed a crime that’s been committed. In the short list of books, you can see two out of the three Arthur Conan Doyle books have a rating above 4. The code should have output these two books.
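The expectation described above can be written down as a small check rather than spotted by eye. The sketch below is a minimal illustration, not part of the original article's code: it uses a shortened copy of the data, reproduces the two functions unchanged (bugs included), and adds a hypothetical helper named `crime_report()` that compares the titles returned with the titles expected.

```python
# Shortened copy of the article's data (an assumption made for brevity):
# three Doyle books and one Christie book, without the "published" key.
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "rating": 4.12},
]

def find_by_author(books_list, last_name):
    """Buggy version, copied unchanged from the article."""
    for book in books_list:
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output

def find_by_rating(books_list, lower_bound):
    """Buggy version, copied unchanged from the article."""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

def crime_report(expected_titles):
    """Hypothetical helper: compare the titles found with the titles expected."""
    found = find_by_rating(find_by_author(books, "Doyle"), 4)
    actual_titles = [book["title"] for book in found]
    return actual_titles == expected_titles, actual_titles

expected = ["A Study in Scarlet", "The Hound of the Baskervilles"]
matches, actual = crime_report(expected)
print(f"{expected = }")
print(f"{actual = }")
print(f"{matches = }")  # False here means a crime has been committed
```

A check like this only proves that something is wrong. It says nothing yet about which line is responsible, which is what the rest of the investigation is for.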

Before you send in your complaints that the last name should be Conan Doyle and not Doyle, please note that I’ve referred to the font of all the world’s truth on this matter: Wikipedia! See Arthur Conan Doyle.

Establish the identity of a victim, suspect or witness

Who’s the victim? I can see how that’s important for a detective trying to solve a crime.

When debugging Python code, you’ll need to understand the problem. If your code raises an error, the victim is shown in red writing in your console. If your code doesn’t raise an error, but your testing shows there’s a problem, you’ll need to be clear about what the problem is. How is the output you get different from the output you were expecting?

As you go through the debugging process, you’ll need to identify who the suspects are. Which lines of your code could be the ones which committed the crime? I’ll talk more about how to deal with suspects later, and how to exclude them or keep them in consideration. But before you can do either of those two things, you’ll need to identify a line of code as a suspect!

You also have witnesses in your code. Often, these are the variables containing data: what are the values of the data and what type of data are they? Before you can interrogate the witnesses, you’ll need to identify them!

Corroborate Or Disprove Witness Accounts

How do you interrogate witnesses to get accurate witness accounts? You’ve probably watched as much crime drama on TV as I have, so I’ll skip what detectives do in real-world crimes. Besides, I strongly suspect (!) real police interrogations are a lot less exciting than those we see on TV.

How do you interrogate the witnesses in your code? You ask the witnesses (variables) for the values they hold and what data types they are. You can do this with the humble print(), using print(witness_variable) and print(type(witness_variable)). Or you can use whatever debugging tool you want. A big part of debugging Python code is looking at the variables’ values and data types.
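As a minimal sketch of such an interrogation, the snippet below asks one witness for its value and its data type. The name `witness_variable` is only a placeholder, and the single book is copied from the example data.

```python
# One book from the example list, used here as the scene of the crime
book = {
    "author": ("Arthur Conan", "Doyle"),
    "title": "A Study in Scarlet",
    "published": 1887,
    "rating": 4.14,
}

# The witness: whatever is stored under the "author" key
witness_variable = book["author"]

print(witness_variable)        # the value the witness holds
print(type(witness_variable))  # the data type of the witness
```

The second print() is often the more revealing one: a value can look plausible on screen while having a type you did not expect.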

Programmers have one advantage over detectives. Witnesses never lie! Once you ask a variable to give up its value and data type, it will always tell you the truth!

Let’s start our investigation into the crime in the code above. You can start from the first function call find_by_author(books, "Doyle"). This takes us to the function definition for find_by_author().

Could the for loop statement have any issues? Is this line a suspect? Let’s ask the witnesses:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    print(f"{books_list = }")
    for book in books_list:
        print(f"{book = }")
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(f"{doyle_books_above_4 = }")

You’ve interrogated the witnesses books_list and book as these witnesses were present on the crime scene when the line was executed. You’re using the print() function as your forensic tool along with the f-string with an = at the end. This use of the f-string is ideal for debugging!
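For readers who have not met this form before, here is a short sketch of what the = at the end of an f-string expression does. It needs Python 3.8 or later. The variable names are taken from the example, and the comparison with a hand-written version is only there for illustration.

```python
last_name = "Doyle"
rating = 4.14

# The = specifier prints the expression text, then the repr of its value
with_equals = f"{last_name = }"
by_hand = f"last_name = {last_name!r}"

print(with_equals)             # last_name = 'Doyle'
print(with_equals == by_hand)  # True

# Spaces around = are kept exactly as typed
print(f"{last_name=}")         # last_name='Doyle'

# A format spec can follow the =, in which case the value is formatted
# with that spec instead of repr
print(f"{rating = :.1f}")      # rating = 4.1
```

Because the default is the repr, strings show up with their quotes, which makes it easy to tell the string 'Doyle' apart from a tuple that merely contains it.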

The output looks like this:

books_list = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}, {'author': ('Agatha', 'Christie'), 'title': 'Murder of the Orient Express (Hercule Poirot #4)', 'published': 1926, 'rating': 4.26}, {'author': ('Agatha', 'Christie'), 'title': 'Death on the Nile (Hercule Poirot #17)', 'published': 1937, 'rating': 4.12}]
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}
book = {'author': ('Agatha', 'Christie'), 'title': 'Murder of the Orient Express (Hercule Poirot #4)', 'published': 1926, 'rating': 4.26}
book = {'author': ('Agatha', 'Christie'), 'title': 'Death on the Nile (Hercule Poirot #17)', 'published': 1937, 'rating': 4.12}
doyle_books_above_4 = []

Exclude A Suspect From A Scene

You saw earlier that you need to identify suspects as you go through your code step by step.

For each line of code you identify as a suspect, you interrogate the witnesses. You can exclude this line of code from your list of suspects if the witness account corroborates what the line is meant to do.

Let’s look at the output from the last version of the code above, when you asked for witness statements from books_list and book in find_by_author().

The first output is what’s returned by print(f"{books_list = }"). This includes all the books in the original list. It’s what you expect from this variable. So far, this witness statement hasn’t led you to suspect this line of code!

The remaining lines are printed by print(f"{book = }"), which is in the for loop. You expected the loop to run five times as there are five items in the list books. There are indeed five lines of output, and each one shows one of the books in the list.

It seems that the for statement can be excluded as a suspect.

You can remove the two calls to print() you added.
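The same corroboration can be written as explicit checks instead of being read off the console. The sketch below is one possible way to do it, assuming a shortened stand-in for the books list (authors only) and a hypothetical helper called `witness_statements()` that records what the for statement hands over on each iteration.

```python
# Stand-in for the article's list: only the "author" key is kept (an assumption)
books = [
    {"author": ("Arthur Conan", "Doyle")},
    {"author": ("Arthur Conan", "Doyle")},
    {"author": ("Arthur Conan", "Doyle")},
    {"author": ("Agatha", "Christie")},
    {"author": ("Agatha", "Christie")},
]

def witness_statements(books_list):
    """Hypothetical helper: record each value of book seen by the loop."""
    statements = []
    for book in books_list:
        statements.append(book)
    return statements

statements = witness_statements(books)

# The loop should run once per book, and see the books in their original order
assert len(statements) == 5
assert statements == books
print("The for statement can be excluded as a suspect")
```

If either assert failed, Python would raise an AssertionError and the final line would never print, so the suspect would stay on the list.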

Link A Suspect With A Scene

However, if the witness account doesn’t exonerate the suspect, you’ll need to leave that line on the list of suspects for the time being. You’ve linked the suspect with the scene of the crime.

Back to our code above. You can move your attention to the if statement in the definition of find_by_author(). You’ve already determined that the variable book contains what you expect. You can look for a clue to help you determine whether the if statement line is a suspect by checking when code in the if block is executed:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    for book in books_list:
        output = []
        if book["author"] == last_name:
            print(f"{book = }")
            output.append(book)
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(f"{doyle_books_above_4 = }")

The output from this investigation is just the empty list returned by the final print() in the code:

doyle_books_above_4 = []

Therefore, the print(f"{book = }") call you’ve just added never happened. This puts suspicion on the line containing the if statement.

You need to call the forensics team:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    for book in books_list:
        output = []
        print(f'{book["author"] = }\n{last_name = }')
        if book["author"] == last_name:
            output.append(book)
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(f"{doyle_books_above_4 = }")

The witnesses present at the crime scene when the if statement was executed are book["author"] and last_name. These are the objects being compared using the equality operator == in the if statement. So, the forensics team decide to print these out just before the if statement. This is the forensics team’s result:

book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Agatha', 'Christie')
last_name = 'Doyle'
book["author"] = ('Agatha', 'Christie')
last_name = 'Doyle'
doyle_books_above_4 = []

And there you are! You’ve found evidence that clearly links the if statement with the crime scene! The value of book["author"] is a tuple. The author’s last name is the second item in this tuple but the if statement incorrectly tries to compare the whole tuple with the last name.

All you need to do is add an index in the if statement:

if book["author"][1] == last_name:
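The forensic finding can be reproduced in isolation. This small sketch uses the two values from the witness statements above and compares them both ways. The intermediate variable names are assumptions chosen for readability.

```python
author = ("Arthur Conan", "Doyle")
last_name = "Doyle"

# Comparing the whole tuple with a string: the types differ, so this is False
whole_tuple_matches = author == last_name

# Comparing only the second item of the tuple with the string
second_item_matches = author[1] == last_name

print(f"{whole_tuple_matches = }")  # whole_tuple_matches = False
print(f"{second_item_matches = }")  # second_item_matches = True
```

Note that comparing a tuple with a string does not raise an error. It quietly evaluates to False, which is why this bug produced no red writing in the console.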

You’ve solved the mystery. But, are you sure? When you run the code now, once you remove the print() call you used for debugging, the output is still the empty list.

Interpret The Scene In Relation To Movements Within The Scene And Sequences Of Events

Looking at a single suspect line of code in isolation is not sufficient. You need to follow how the data is being manipulated on that line and the lines before and after it.

This is the only way to investigate what has really happened during the crime.

Let’s look at the whole for loop in the definition of find_by_author() again.

You’ve already interrogated book["author"] and last_name. You can even interrogate book["author"][1] just to be sure. If you do so, you’ll see that its account seems to make sense.

The other witness on the scene is the list output. You can interrogate output at the end of the for loop:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    for book in books_list:
        output = []
        if book["author"][1] == last_name:
            output.append(book)
        print(f"{output = }")
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(f"{doyle_books_above_4 = }")

This code now gives the following result:

output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = []
output = []
doyle_books_above_4 = []

The first line is correct. You expect the first book in the list to be added to output since it’s an Arthur Conan Doyle book. However, you expect it to still be there in the second line. “The Sign of Four” should have been added to “A Study in Scarlet”. Instead, it seems like it has replaced it.

You notice the same clues for the other results, too. In fact, the list is empty in the fourth and fifth outputs. (The final empty list is the output from the final print() at the end of the code.)

You interrogated output as a witness, but it’s actually a suspect now! Therefore, you study its movements across the crime scene, sketching things on a whiteboard with lots of arrows, as they do in the detective films.

Gotcha! You finally see it. The code is re-initialising output every time inside the for loop. That’s a serious crime. You move the line with output = [] outside the loop:

books = [
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "A Study in Scarlet",
        "published": 1887,
        "rating": 4.14,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Sign of Four",
        "published": 1890,
        "rating": 3.92,
    },
    {
        "author": ("Arthur Conan", "Doyle"),
        "title": "The Hound of the Baskervilles",
        "published": 1901,
        "rating": 4.13,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Murder of the Orient Express (Hercule Poirot #4)",
        "published": 1926,
        "rating": 4.26,
    },
    {
        "author": ("Agatha", "Christie"),
        "title": "Death on the Nile (Hercule Poirot #17)",
        "published": 1937,
        "rating": 4.12,
    },
]

def find_by_author(books_list, last_name):
    """Find books by author's last name"""
    # Note, you could use list comprehensions, but I'm using
    # long form for loop to make debugging easier
    output = []
    for book in books_list:
        if book["author"][1] == last_name:
            output.append(book)
        print(f"{output = }")
    return output

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output

doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_rating(doyle_books, 4)

print(f"{doyle_books_above_4 = }")

The code now gives the following. Note that you’re still interrogating output at the end of each iteration of the for loop through a print() call:

output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
doyle_books_above_4 = []

You can now remove output from your list of suspects as the five print-outs you get are what you expect. The first three show the Arthur Conan Doyle titles, added one at a time. The last two do not add the Agatha Christie books to the list output.

This is what you expect find_by_author() to do!
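The crime that output was involved in can be shown in a stripped-down form. The sketch below is an illustration on a plain list of last names rather than the book dictionaries. The function names `collect_reset_each_time()` and `collect_once()` are hypothetical and exist only to put the two versions side by side.

```python
names = ["Doyle", "Doyle", "Christie"]

def collect_reset_each_time(items, wanted):
    """Buggy pattern: the list is created again on every iteration."""
    for item in items:
        found = []  # wipes out anything collected so far
        if item == wanted:
            found.append(item)
    return found

def collect_once(items, wanted):
    """Fixed pattern: the list is created once, before the loop starts."""
    found = []
    for item in items:
        if item == wanted:
            found.append(item)
    return found

print(collect_reset_each_time(names, "Doyle"))  # []
print(collect_once(names, "Doyle"))             # ['Doyle', 'Doyle']
```

The buggy version returns only what the last iteration collected, which is nothing here because the last name is Christie. It has a second problem as well: called with an empty list, the loop body never runs, found is never created, and the return statement raises UnboundLocalError.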

Link Crime Scene To Crime Scene And Provide Intelligence On Crime Patterns

Criminals rarely commit just one crime. No wonder one of the guidelines from the College of Policing is to link crime scenes and look for crime patterns.

Don’t assume there’s only one bug in your code. And bugs may well be interconnected. You may think you’ve solved the mystery, only to find that there’s another crime scene to investigate!

In the last output from the code above, you may have noticed that the final line still shows an empty list! Your detective work leads you to a different crime scene now. You need to explore the find_by_rating() function definition.

But, by now, you’re a senior detective and very experienced. So I’ll let you finish off the investigation yourself!
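If you would like a starting point, here is one possible way to open the case file. It is a sketch only: it takes the witness statements at the new crime scene and leaves the verdict to you. The list doyle_books is a shortened stand-in for what the corrected find_by_author() returns, and the variable condition is an assumption introduced so the result of the comparison can be printed.

```python
# Stand-in for the output of the corrected find_by_author() (titles and ratings only)
doyle_books = [
    {"title": "A Study in Scarlet", "rating": 4.14},
    {"title": "The Sign of Four", "rating": 3.92},
    {"title": "The Hound of the Baskervilles", "rating": 4.13},
]

def find_by_rating(books_list, lower_bound):
    """Find books with a rating higher than lower_bound"""
    output = []
    for book in books_list:
        # Same comparison as in the article, stored so it can be interrogated
        condition = book["rating"] == lower_bound
        print(f'{book["rating"] = }, {lower_bound = }, {condition = }')
        if condition:
            output.append(book)
    return output

doyle_books_above_4 = find_by_rating(doyle_books, 4)
print(f"{doyle_books_above_4 = }")
```

Compare each line of witness statements with what the docstring says the function should do.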

End Of Investigation

Although I couldn’t find the titles “Sherlock Holmes and the Python Bugs” or “Debugging Python on the Nile” in my local library, I think it’s only a matter of time until we have a new genre of crime fiction novels based on debugging Python code. They’ll make for gripping reading.

In the meantime, you can read Sherlock Holmes and Hercule Poirot books to learn how to debug Python code. Or maybe not…

The post Debugging Python Code Is Like Detective Work — Let’s Investigate appeared first on The Python Coding Book.

April 17, 2022 01:09 AM UTC

April 16, 2022


Talk Python to Me

#361: Pangeo Data Ecosystem

Python's place in climate research is an important one. In this episode, you'll meet Joe Hamman and Ryan Abernathey, two researchers using powerful cloud computing systems and Python to understand how the world around us is changing. They are both involved in the Pangeo project, which brings a great set of tools for scaling complex compute with Python.

Links from the show

Ryan Abernathey: @rabernat (https://twitter.com/rabernat)
Joe Hamman: @HammanHydro (https://twitter.com/HammanHydro)
Pangeo: https://pangeo.io/
xarray: https://xarray.dev/
Pangeo Forge: https://pangeo-forge.org/
fsspec: https://filesystem-spec.readthedocs.io/en/latest/features.html
Step-by-Step Guide to Building a Big Data Portal: https://medium.com/pangeo/step-by-step-guide-to-building-a-big-data-portal-e262af1c2977
Coiled: https://coiled.io/
Pangeo Gallery: http://gallery.pangeo.io/
Pangeo Quickstart: https://pangeo.io/quickstart.html#quickstart
JupyterLite: https://jupyterlite.readthedocs.io/en/latest/
Jupyter: https://jupyter.org/
Pangeo Packages: https://pangeo.io/packages.html#packages
Pangeo Discourse: https://discourse.pangeo.io/
Watch this episode on YouTube: https://www.youtube.com/watch?v=T3jUatZ1KTo

Stay in touch with us

Subscribe on YouTube: https://talkpython.fm/youtube
Follow Talk Python on Twitter: @talkpython (https://twitter.com/talkpython)
Follow Michael on Twitter: @mkennedy (https://twitter.com/mkennedy)

Sponsors

SignalWire: https://talkpython.fm/signalwire
Sentry Error Monitoring, Code TALKPYTHON: https://talkpython.fm/sentry
Talk Python Training: https://talkpython.fm/training

April 16, 2022 08:00 AM UTC

April 15, 2022


Real Python

The Real Python Podcast – Episode #106: Class Constructors & Pythonic Image Processing

Do you know the difference between creating a class instance and initializing it? Would you like an interactive tour of the Python Pillow library? This week on the show, Christopher Trudeau is here, and he's brought another batch of PyCoder's Weekly articles and projects.


April 15, 2022 12:00 PM UTC


Python Bytes

#279 Autocorrect and other Git Tricks

<p><strong>Watch the live stream:</strong></p> <a href='https://www.youtube.com/watch?v=RXqP1q8Yp1g' style='font-weight: bold;'>Watch on YouTube</a><br> <br> <p><strong>About the show</strong></p> <p>Sponsored by Datadog: <a href="http://pythonbytes.fm/datadog"><strong>pythonbytes.fm/datadog</strong></a></p> <p>Special guest: Brian Skinn (<a href="https://twitter.com/btskinn">Twitter</a> | <a href="https://github.com/bskinn">Github</a>)</p> <p><strong>Michael #1:</strong> <a href="https://www.openbb.co"><strong>OpenBB wants to be an open source challenger to Bloomberg Terminal</strong></a></p> <ul> <li>OpenBB Terminal provides a modern Python-based integrated environment for investment research, that allows an average joe retail trader to leverage state-of-the-art Data Science and Machine Learning technologies.</li> <li>As a modern Python-based environment, OpenBBTerminal opens access to numerous Python data libraries in <ul> <li>Data Science (Pandas, Numpy, Scipy, Jupyter)</li> <li>Machine Learning (Pytorch, Tensorflow, Sklearn, Flair)</li> <li>Data Acquisition (Beautiful Soup, and numerous third-party APIs)</li> </ul></li> <li>They have a discord community too</li> <li>BTW, seem to be a successful open source project: <a href="https://finance.yahoo.com/news/openbb-raises-8-5m-seed-160000637.html">OpenBB Raises $8.5M in Seed Round Funding Following Open Source Project Gamestonk Terminal's Success</a></li> <li>Great <a href="http://">graphics / gallery</a> here.</li> <li>Way more affordable than the $1,900/mo/user for the Bloomberg Terminal</li> </ul> <p><strong>Brian #2:</strong> <strong>Python f-strings</strong></p> <ul> <li><a href="https://fstring.help/"><strong>https://fstring.help</strong></a> <ul> <li>Florian Bruhin</li> <li>Quick overview of cool features of f-strings, made with Jupyter</li> </ul></li> <li><a href="https://towardsdatascience.com/python-f-strings-are-more-powerful-than-you-might-think-8271d3efbd7d"><strong>Python f-strings Are More Powerful 
Than You Might Think</strong></a> <ul> <li>Martin Heinz</li> <li>More verbose discussion of f-strings</li> </ul></li> <li>Both are great to up your string formatting game.</li> </ul> <p><strong>Brian S. #3: pyproject.toml and PEP 621 Support in setuptools</strong></p> <ul> <li><a href="https://peps.python.org/pep-0621/">PEP 621: “Storing project metadata in pyproject.toml”</a> <ul> <li>Authors: Brett Cannon, Dustin Ingram, Paul Ganssle, Pradyun Gedam, Sébastien Eustace, Thomas Kluyver, Tzu-ping Chung (Jun-Oct 2020)</li> <li>Covers build-tool-independent fields (<code>name</code>, <code>version</code>, <code>description</code>, <code>readme</code>, <code>authors</code>, etc.)</li> </ul></li> <li>Various tools had already implemented pyproject.toml support, but not setuptools <ul> <li>Including: Flit, Hatch, PDM, Trampolim, and Whey (h/t: <a href="https://scikit-hep.org/developer/pep621">Scikit-HEP</a>)</li> <li>Not Poetry yet, though it's <a href="https://github.com/python-poetry/poetry/issues/3332">under discussion</a></li> </ul></li> <li>setuptools support had been <a href="https://github.com/pypa/setuptools/issues/1688">discussed pretty extensively</a>, and had been included on the PSF’s list of <a href="https://github.com/psf/fundable-packaging-improvements/blob/master/FUNDABLES.md#add-support-for-pyprojecttoml-as-a-way-to-configure-setuptools">fundable packaging improvements</a></li> <li>Initial experimental implementation spearheaded by <a href="https://github.com/abravalheri">Anderson Bravalheri</a>, recently completed <ul> <li>Seeking testing and bug reports from the community (<a href="https://discuss.python.org/t/help-testing-experimental-features-in-setuptools/13821">Discuss thread</a>)</li> <li>I tried it on one of my projects — it mostly worked, but revealed a <a href="https://github.com/pypa/setuptools/issues/3244">bug</a> that Anderson fixed super-quick (proper handling of a dynamic <code>long_description</code>, defined in <code>setup.py</code>)</li> 
</ul></li> <li>Related tools (all early-stage/experimental AFAIK) <ul> <li><a href="https://github.com/abravalheri/ini2toml">ini2toml</a> (Anderson Bravalheri) — Can convert setup.cfg (which is in INI format) to pyproject.toml <ul> <li>Mostly worked well for me, though I had to manually fix a couple things, most of which were due to limitations of the INI format <ul> <li>INI has no list syntax!</li> </ul></li> </ul></li> <li><a href="https://github.com/abravalheri/validate-pyproject">validate-pyproject</a> (Anderson Bravalheri) — Automated pyproject.toml checks</li> <li><a href="https://github.com/tox-dev/pyproject-fmt">pyproject-fmt</a> (Bernát Gábor) — Autoformatter for pyproject.toml</li> </ul></li> <li>Don’t forget to use it with <code>build</code>, instead of via a <code>python setup.py</code> invocation! <ul> <li><code>$ pip install build</code> <code>$ python -m build</code></li> </ul></li> <li>Will also want to constrain your <code>setuptools</code> version in the <code>build-backend.requires</code> key of <code>pyproject.toml</code> (you <em>are</em> using PEP517/518, right??)</li> </ul> <p><strong>Michael #4:</strong> <a href="https://jwt.io">JSON Web Tokens @ jwt.io</a></p> <ul> <li>JSON Web Tokens are an open, industry standard <a href="https://tools.ietf.org/html/rfc7519"><strong>RFC 7519</strong></a> method for representing claims securely between two parties.</li> <li>Basically a visualizer and debugger for JWTs <ul> <li>Enter an encoded token</li> <li>Select a decryption algorithm</li> <li>See the payload data</li> <li>verify the signature</li> </ul></li> <li><a href="https://jwt.io/libraries">List of libraries</a>, grouped by language</li> </ul> <p><strong>Brian #5:</strong> <a href="https://waylonwalker.com/til/git-config-help-autocorrect/"><strong>Autocorrect</strong></a> <strong>and other Git Tricks</strong></p> <pre><code>- Waylon Walker - Use `git config --global help.autocorrect 10` to have git automatically run the command you meant in 1 
second. The `10` is 10 x 1/10 of a second. So `50` for 5 seconds, etc. </code></pre> <p></p> <ul> <li>Automatically set upstream branch if it’s not there <ul> <li><code>git config --global push.default current</code></li> <li>You may NOT want to do this if you are not careful with your branches.</li> <li>From https://stackoverflow.com/a/22933955</li> </ul></li> <li><code>git commit -a</code> <ul> <li>Automatically “add” all changed and deleted files, but not untracked files.</li> <li>From https://git-scm.com/docs/git-commit#Documentation/git-commit.txt--a </li> </ul></li> <li>Now most of my interactions with git CLI, especially for quick changes, is: $ git checkout main $ git pull $ git checkout -b okken_something $ git commit -a -m 'quick message' $ git push</li> <li>With these working, with autocorrect $ git chkout main $ git pll $ git comit -a -m 'quick message' $ git psh</li> </ul> <p><strong>Brian S. #6: jupyter-tempvars</strong></p> <ul> <li>Jupyter notebooks are great, and the global namespace of the Python kernel backend makes it super easy to flow analysis from one cell to another</li> <li>BUT, that global namespace also makes it super easy to footgun, when variables leak into/out of a cell when you don’t want them to</li> <li><a href="https://github.com/bskinn/jupyter-tempvars">jupyter-tempvars</a> notebook extension <ul> <li>Built on top of the <a href="https://github.com/bskinn/tempvars">tempvars</a> library, which defines a <code>TempVars</code> context manager for handling temporary variables <ul> <li>When you create a <code>TempVars</code> context manager, you provide it patterns for variable names to treat as temporary</li> <li>In its simplest form, <code>TempVars</code> (1) clears matching variables from the namespace on entering the context, and then (2) clears them <em>again</em> upon exiting the context, and restoring their prior values, if any</li> <li><code>TempVars</code> works great, but it’s cumbersome and distracting to manually include it 
in every notebook cell where it’s needed</li> </ul></li> <li>With <code>jupyter-tempvars</code>, you instead apply tags with a specific format to notebook cells, and the extension automatically wraps each cell’s code in a <code>TempVars</code> context before execution</li> </ul></li> <li>Javascript adapted from existing extensions <ul> <li>Patching <code>CodeCell.execute</code>, from the <code>jupyter_contrib_nbextensions</code> ‘<a href="https://github.com/ipython-contrib/jupyter_contrib_nbextensions/blob/a186b18efaa1f55fba64f08cd9d8bf85cba56d25/src/jupyter_contrib_nbextensions/nbextensions/execution_dependencies/execution_dependencies.js#L31-L33">Execution Dependencies</a>’ extension, to <a href="https://github.com/bskinn/jupyter-tempvars/blob/491babaca4f48c8d453ce4598ac12aa6c5323181/src/jupyter_tempvars/extension/jupyter_tempvars.js#L127-L143">enclose the cell code with the context manager</a></li> <li>Listening for the ‘kernel ready’ event, from <code>[jupyter-black](https://github.com/drillan/jupyter-black/blob/d197945508a9d2879f2e2cc99cafe0cedf034cf2/kernel_exec_on_cell.js#L347-L350)</code>, to <a href="https://github.com/bskinn/jupyter-tempvars/blob/491babaca4f48c8d453ce4598ac12aa6c5323181/src/jupyter_tempvars/extension/jupyter_tempvars.js#L42-L46">import the</a> <code>[TempVars](https://github.com/bskinn/jupyter-tempvars/blob/491babaca4f48c8d453ce4598ac12aa6c5323181/src/jupyter_tempvars/extension/jupyter_tempvars.js#L42-L46)</code> <a href="https://github.com/bskinn/jupyter-tempvars/blob/491babaca4f48c8d453ce4598ac12aa6c5323181/src/jupyter_tempvars/extension/jupyter_tempvars.js#L42-L46">context manager upon kernel (re)start</a></li> </ul></li> <li>See the <a href="https://github.com/bskinn/jupyter-tempvars/blob/main/README.md">README</a> (with animated GIFs!) 
for installation and usage instructions <ul> <li>It’s on PyPI: <code>$ pip install jupyter-tempvars</code></li> <li>And, I made a shortcut install script for it: <code>$ jupyter-tempvars install &amp;&amp; jupyter-tempvars enable</code></li> </ul></li> <li>Please try it out, <a href="https://github.com/bskinn/jupyter-tempvars/issues">find/report bugs, and suggest features</a>!</li> <li>Future work <ul> <li>Publish to conda-forge (definitely)</li> <li>Adapt to JupyterLab, VS Code, etc. (pending interest)</li> </ul></li> </ul> <p><strong>Extras</strong> </p> <p>Brian:</p> <ul> <li>Ok. <a href="https://discuss.python.org/t/github-issues-are-now-live">Python issues are now on GitHub.</a> Seriously. <a href="https://github.com/python/cpython/issues">See for yourself.</a></li> <li><a href="https://www.lipsum.com/">Lorem Ipsum is more interesting than I realized.</a></li> <li><a href="https://dev.to/rly">O RLY Cover Generator</a></li> <li>Example: <br /> <img src="https://paper-attachments.dropbox.com/s_E9964F8854B7AC3C00CC5A9D2E286CD011C21B235E0D7DEA457256E82F09C5FE_1649825870185_100_percent_coverage 2.png" alt="" /></li> </ul> <p>Michael:</p> <ul> <li>New course: <a href="https://talkpython.fm/fastapi-azure-ad">Secure APIs with FastAPI and the Microsoft Identity Platform</a></li> <li><a href="https://github.com/seth-c-stenzel/Pyenv-Virtualenv-for-Windows-Sorta-ish">Pyenv Virtualenv for Windows</a> <a href="http://">(Sorta'ish)</a></li> <li><a href="https://hipsum.co/">Hipster Ipsum</a></li> </ul> <p>Brian S.: </p> <ul> <li>PSF staff is expanding <ul> <li><a href="https://www.python.org/jobs/6261/">PSF hiring an Infrastructure Engineer</a> <ul> <li>Link now 404s, perhaps they’ve made their hire?</li> </ul></li> <li>Last year’s <a href="https://pyfound.blogspot.com/2021/08/shamika-mohanan-has-joined-psf-as.html">hire of the Packaging Project Manager</a> (Shamika Mohanan)</li> <li>Steering Council <a 
href="https://github.com/encukou/steering-council/blob/a9185ebe4de915b59290a1ebb380e93e6a8d3c4f/updates/2022-03-steering-council-update.md#2022-03-07">supports PSF hiring a second developer-in-residence</a></li> </ul></li> <li>PSF has <a href="https://twitter.com/ThePSF/status/1512078490689961993">chosen</a> its new Executive Director: <a href="https://twitter.com/baconandcoconut">Deb Nicholson</a>!</li> <li><a href="https://www.pyohio.org/2022/speaking/call-for-proposals/">PyOhio 2022 Call for Proposals</a> is open</li> <li>Teaser <a href="https://twitter.com/samuel_colvin/status/1512872630696747022">tweet</a> for performance improvements to pydantic </li> </ul> <p><strong>Jokes:</strong> </p> <p><a href="https://twitter.com/CaNerdIan/status/1512628780212396036">https://twitter.com/CaNerdIan/status/1512628780212396036</a></p> <p><a href="https://www.reddit.com/r/ProgrammerHumor/comments/tuh06y/i_guess_we_all_have_been_there/">https://www.reddit.com/r/ProgrammerHumor/comments/tuh06y/i_guess_we_all_have_been_there/</a></p> <p><a href="https://twitter.com/PR0GRAMMERHUM0R/status/1507613349625966599">https://twitter.com/PR0GRAMMERHUM0R/status/1507613349625966599</a></p>

April 15, 2022 08:00 AM UTC

April 14, 2022


Ned Batchelder

Python custom formatting

Python f-strings use a formatting mini-language, the same as the older .format() function. After the colon come short specifications for how to format the value:

>>> word = "Hello"
>>> f"{word:/^20}"
'///////Hello////////'
>>> amt = 12345678
>>> f"{amt:20,}"
'          12,345,678'

Datetimes can use strftime syntax:

>>> from datetime import datetime
>>> now = datetime.now()
>>> f"{now:%Y-%m on day %d}"
'2022-04 on day 14'

The reason datetime uses different formatting specs than strings is because datetime defines its own __format__ method. Any object can define its own formatting mini-language. F-strings and .format() will use the __format__ method on an object, and pass it the formatting directives being used:

>>> class Confused:
...     def __format__(self, fmt):
...         return f"What is {fmt}?"
...
>>> c = Confused()
>>> f"{c:xyz12}"
'What is xyz12?'

Of course, __format__ can be used for more useful formatting than Confused is doing...

Geographic latitude and longitude are conventionally presented in a few different formats: degrees; or degrees and minutes; or degrees, minutes and seconds. Then the numbers can have a varying number of decimal places, and sometimes the units are represented by symbols.

Here’s an implementation of those possibilities in __format__. The format string starts with “d”, “dm”, or “dms” to indicate the basic format. The number of decimal places can be specified with “.N”. Finally, symbols can be added, either plain or fancy, by adding a quote or minute symbol:

import dataclasses, re


@dataclasses.dataclass
class LatLong:
    lat: float
    long: float

    def __format__(self, fmt):
        dms, nfmt, opts = re.fullmatch(r"(dm?s?)([.\d]*)([′']?)", fmt).groups()
        formatted = []
        for num in [self.lat, self.long]:
            parts = []
            for ms in dms[1:]:
                parts.append(str(int(num)))
                num = abs((num - int(num)) * 60)
            parts.append(format(num, nfmt + "f"))
            syms = None
            if "'" in opts:
                syms = "°'\""
            elif "′" in opts:
                syms = "°′″"
            if opts:
                parts = [p + s for p, s in zip(parts, syms)]
            formatted.append(" ".join(parts))
        joined = ", ".join(formatted)
        return joined

>>> where = LatLong(42.359764937, -71.092068768)
>>> print(f"Location: {where:d'}")
Location: 42.359765°, -71.092069°
>>> print(f"Location: {where:d.4}")
Location: 42.3598, -71.0921
>>> print(f"Location: {where:dm'}")
Location: 42° 21.585896', -71° 5.524126'
>>> print(f"Location: {where:dms.4'}")
Location: 42° 21' 35.1538", -71° 5' 31.4476"
>>> print(f"Location: {where:dms.4}")
Location: 42 21 35.1538, -71 5 31.4476
>>> print(f"Location: {where:dms.6′}")
Location: 42° 21′ 35.153773″, -71° 5′ 31.447565″
>>> print("There: {:dms.6′}".format(where))
There: 42° 21′ 35.153773″, -71° 5′ 31.447565″
>>> print(format(where, "dms.6′"))
42° 21′ 35.153773″, -71° 5′ 31.447565″

This implementation doesn’t handle errors properly, but shows the basic idea. Also, lat/long are often shown with N/S E/W instead of positive and negative values. That’s left as an exercise for the reader.
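One possible starting point for the N/S E/W exercise is sketched below. This is not the author's solution: the helper names with_hemisphere and lat_long_nsew are hypothetical, the precision is fixed at four decimal places for brevity, and it handles only the plain degrees format.

```python
def with_hemisphere(value: float, positive: str, negative: str) -> str:
    """Format a signed coordinate as an unsigned number plus a hemisphere letter."""
    letter = positive if value >= 0 else negative
    return f"{abs(value):.4f}° {letter}"


def lat_long_nsew(lat: float, long: float) -> str:
    """Join latitude (N/S) and longitude (E/W) into one string."""
    return f"{with_hemisphere(lat, 'N', 'S')}, {with_hemisphere(long, 'E', 'W')}"


print(lat_long_nsew(42.359764937, -71.092068768))
# 42.3598° N, 71.0921° W
```

The same sign test could be folded into LatLong.__format__ behind an extra option character in the format string.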

April 14, 2022 11:35 PM UTC


Mike Driscoll

Python 101 - Intro to Graphing with Python and Matplotlib (Video)

Python has lots of data visualization packages available to it. One of the most popular is Matplotlib.

In this video tutorial, you will be learning about the following topics:

Related Articles

The post Python 101 - Intro to Graphing with Python and Matplotlib (Video) appeared first on Mouse Vs Python.

April 14, 2022 07:08 PM UTC

April 13, 2022


Anarcat

Tuning my wifi radios

After listening to an episode of the 2.5 admins podcast, I realized there was some sort of low-hanging fruit I could pick to better tune my WiFi at home. You see, I'm kind of a fraud in WiFi: I only started a WiFi mesh in Montreal (now defunct), I don't really know how any of that stuff works. So I was surprised to hear one of the podcast hosts say "it's all about airtime" and "you want to reduce the power on your access points" (APs). It seemed like sound advice: better bandwidth means less time on air, which means fewer collisions and less latency, and less power also means fewer collisions. Worth a try, right?

Frequency

So the first thing I looked at was WifiAnalyzer to see if I had any optimisation I could do there. Normally, I try to avoid having nearby APs on the same frequency to avoid collisions, but who knows, maybe I had messed that up. And turns out I did! Both APs were on "auto" for 5GHz, which typically means "do nothing or worse".

5GHz is really interesting, because, in theory, there are LOTS of channels to pick from, it goes up to 196!! And both my APs were on 36, what gives?

So the first thing I did was to set it to channel 100, as there was that long gap in WifiAnalyzer where no other AP was. But that just broke 5GHz on the AP. The OpenWRT GUI (luci) would just say "wireless not associated" and the ESSID wouldn't show up in a scan anymore.

At first, I thought this was a problem with OpenWRT or my hardware, but I could reproduce the problem with both my APs: a TP-Link Archer A7 v5 and a Turris Omnia (see also my review).

As it turns out, that's because that range of the WiFi band interferes with trivial things like satellites and radar, which make the actually very useful radar maps look like useless christmas trees. So those channels require DFS to operate. DFS works by first listening on the frequency for a certain amount of time (1-2 minutes, but could be as high as 10) to see if there's something else transmitting at all.

So typically, that means they just don't operate at all in those bands, especially if you're near any major city which generally means you are near a weather radar that will transmit on that band.

In the system logs, if you have such a problem, you might see this:

Apr  9 22:17:39 octavia hostapd: wlan0: DFS-CAC-START freq=5500 chan=100 sec_chan=1, width=0, seg0=102, seg1=0, cac_time=60s
Apr  9 22:17:39 octavia hostapd: DFS start_dfs_cac() failed, -1

... and/or this:

Sat Apr  9 18:05:03 2022 daemon.notice hostapd: Channel 100 (primary) not allowed for AP mode, flags: 0x10095b NO-IR RADAR
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Configured channel (100) not found from the channel list of current mode (2) IEEE 802.11a
Sat Apr  9 18:05:03 2022 daemon.warn hostapd: wlan0: IEEE 802.11 Hardware does not support configured channel

Here, it clearly says RADAR (in all caps too, which means it's really important). NO-IR is also important, I'm not sure what it means but it could be that you're not allowed to transmit in that band because of other local regulations.

There might be a way to work around those by changing the "region" in the Luci GUI, but I didn't mess with that, because I figured that other devices will have that already configured. So using a forbidden channel might make it more difficult for clients to connect (although it's possible this is enforced only on the AP side).

In any case, 5GHz is promising, but in reality, you only get from channel 36 (5.170GHz) to 48 (5.250GHz), inclusively. Fast counters will notice that is exactly 80MHz, which means that if an AP is configured for that hungry, all-powerful 80MHz, it will effectively take up all 5GHz channels at once.
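The 80MHz figure can be checked with a little arithmetic. The sketch below assumes the usual 5GHz numbering, where a channel's center frequency in MHz is 5000 + 5 × channel (which matches the freq=5500 chan=100 pair in the log above), and a 20MHz width per channel. The function name channel_edges_mhz is made up for illustration.

```python
def channel_edges_mhz(channel: int, width_mhz: int = 20) -> tuple[int, int]:
    """Return the (lower, upper) edge of a 5GHz channel in MHz."""
    center = 5000 + 5 * channel  # assumed 5GHz channel numbering
    half = width_mhz // 2
    return center - half, center + half


low, _ = channel_edges_mhz(36)   # lower edge of channel 36
_, high = channel_edges_mhz(48)  # upper edge of channel 48
print(low, high, high - low)
# 5170 5250 80
```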

This, in other words, is as bad as 2.4GHz, where you also have only two 40MHz channels. (Really, what did you expect: this is an unregulated frequency controlled by commercial interests...)

So the first thing I did was to switch to 40MHz. This gives me two distinct channels in 5GHz at no noticeable bandwidth cost. (In fact, I couldn't find hard data on what the bandwidth ends up being on those frequencies, but I could still get 400Mbps which is fine for my use case.)

Power

The next thing I did was to fiddle with power. By default, both radios were configured to transmit as much power as they needed to reach clients, which means that if a client gets farther away, the radio would boost its transmit power which, in turn, would mean the client would still connect to it instead of failing and properly roaming to the other AP.

The higher power also means more interference with neighbors and other APs, although that matters less if they are on different channels.

On 5GHz, power was about 20dBm (100 mW) -- and more on the Turris! -- when I first looked, so I tried to lower it drastically to 5dBm (3mW) just for kicks. That didn't work so well, so I bumped it back up to 14 dBm (25 mW) and that seems to work well: clients hit about -80dBm when they get far enough from the AP, which gets close to the noise floor (and where the neighbor APs are), which is exactly what I want.

On 2.4GHz, I lowered it down even further, to 10 dBm (10mW): since it's better at going through walls, I figured it would need less power. And anyways, I'd rather people use the 5GHz APs, so maybe that will act as an encouragement to switch. I was still able to connect correctly to the APs at that power as well.
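For reference, the dBm and mW figures quoted above are related by the standard conversion mW = 10^(dBm/10). A quick sketch (the function name dbm_to_mw is just for illustration):

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


for dbm in (20, 14, 10, 5):
    print(f"{dbm} dBm = {dbm_to_mw(dbm):.1f} mW")
# 20 dBm = 100.0 mW
# 14 dBm = 25.1 mW
# 10 dBm = 10.0 mW
# 5 dBm = 3.2 mW
```

Since the scale is logarithmic, every 10 dBm step is a factor of ten in power, so going from 20 dBm to 10 dBm cuts the transmitted power by 90%.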

Other tweaks

I disabled the "Allow legacy 802.11b rates" setting in the 5GHz configuration. According to this discussion:

Checking the "Allow b rates" affects what the AP will transmit. In particular it will send most overhead packets including beacons, probe responses, and authentication / authorization as the slow, noisy, 1 Mb DSSS signal. That is bad for you and your neighbors. Do not check that box. The default really should be unchecked.

This, in particular, "will make the AP unusable to distant clients, which again is a good thing for public wifi in general". So I just unchecked that box and I feel happier now. I didn't make tests to see the effect separately however, so this is mostly just a guess.

April 13, 2022 08:56 PM UTC


PyCharm

Introducing PyCharm 2022.1!

In this first release of 2022, we decided to focus on polishing existing features and workflows instead of adding new functionality, especially after our previous release, which introduced multiple long-awaited features such as support for Jupyter and Remote Development. Here is a summary of what’s new in PyCharm 2022.1.

Download PyCharm 2022.1

IDE

Authentication support for custom package repositories

Now you can configure basic HTTP authentication to access custom package repositories and manage dependencies via PyCharm without switching to the terminal for manual installation.

Go to the Python Packages tool window, click on the gear icon, click the plus sign in the dialog window, add the repository URL, and then select the Basic HTTP option to enter the required credentials. The new repository will appear on the list of packages in the left-hand side window.

Code insight

Enhanced code completion for TypedDict

Dict literals can be used as arguments for functions or to instantiate objects from classes where TypedDict is expected. Doing so became even easier in PyCharm 2022.1 thanks to its new code completion for keys.

Improved TypedDict per-key warnings

We also improved the warnings for TypedDict. Now, when a dictionary created as a literal or by using the dict constructor is used where TypedDict is expected, PyCharm will display per-key error messages pointing to the individual values that are wrong, missing, or not expected.
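As a rough illustration of the kind of code these two features apply to, here is a small TypedDict with a dict literal passed where it is expected. The Book class and describe function are hypothetical names invented for this sketch. Note that TypedDict is not enforced at runtime; the checks described above come from the IDE or a type checker.

```python
from typing import TypedDict


class Book(TypedDict):
    title: str
    published: int


def describe(book: Book) -> str:
    """Build a short label from a Book dictionary."""
    return f"{book['title']} ({book['published']})"


# A dict literal used where a TypedDict is expected: this is where
# key completion and per-key warnings would come into play.
print(describe({"title": "A Study in Scarlet", "published": 1887}))
# A Study in Scarlet (1887)
```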

Improved Markdown support

Run commands from Markdown files

You often find instructions with commands to execute when working with Markdown files, such as in README files, for example. Now you can just run those commands directly from the file itself, by using the run icon in the gutter.

Copy code snippet for Markdown

Copying and pasting code snippets from Markdown files is also very common, and now you can use the new Copy code snippet to do this, available in Markdown blocks.

Jupyter Support [Pro]

Code cells remain in Edit mode after execution

In order to make working with Jupyter notebooks a more fluid experience, new cells inserted after you Run Cell and Insert Below now default to Edit mode, so you can start writing code in it right away.

The same behavior applies when you are editing and decide to run your current cell. It will remain in Edit mode even after being executed.

Optimized cell copy-pasting

To make your Jupyter experience more pleasant still, we changed the copy/cut and paste behavior for when you copy or cut a cell that has an output while in Command mode. Now, whenever you paste it back to your notebook, the output will also be pasted and you won’t need to execute the cell again.

Databases [Pro]

MongoDB: Editing fields in results

In PyCharm 2022.1, editing the results in MongoDB collections or result sets obtained via .find() became as easy as it is in relational databases. This improvement also works when cursor methods that modify the result, such as sort() or limit(), are executed after .find().

Docker [Pro]

New Services UI for Docker

Working with Docker became easier and more organized in PyCharm 2022.1. We’ve significantly reworked the Docker UI in the Services tool window to give you much clearer control of your containers, images, networks, and volumes.

These are all the features we wanted to highlight. Read about other features included in this release on our What’s New page, or check the release notes for the full list of implemented features and bug fixes.

As always, your feedback is highly appreciated. Share it with us on Twitter (@pycharm), or by reporting any bugs or requests to our tracker.

Happy coding!

The PyCharm team

April 13, 2022 05:19 PM UTC


Malthe Borch

Automatic HTTPS on Kubernetes

The ingress controller supported by the Kubernetes project itself is nginx. And while there are recipes for setting up automated issuing of TLS certificates using free CAs such as Let's Encrypt, there are quite a few steps involved and you will need to deploy additional services to your cluster to make it work.

Meanwhile, the ingress controller for Caddy does it fully-automated, out-of-the-box.

Enable it during install using the onDemandTLS option like so:

$ helm install \
    --namespace=caddy-system \
    --repo https://caddyserver.github.io/ingress/ \
    --atomic \
    --set image.tag=v0.1.0 \
    --set ingressController.config.onDemandTLS=true \
    --set ingressController.config.email=<your-email> \
    --set replicaCount=1 \
    --version 1.0.0 \
    main \
    caddy-ingress-controller

The email option is to allow the CA to send expiry notices if your certificate is coming up for renewal. I suppose that doesn't hurt.

Sometimes it's nice to point a domain to localhost and have HTTPS working for it nonetheless – for example, when testing out authentication flows.

I use a combination of tools to achieve this:

In the real world, my domain is pointing to the Kubernetes cluster. But since I don't have nginx running as my ingress controller, I need an actual service to reply to the ACME requests that will be sent to <my-domain>/.well-known/acme-challenge/<key>.

Python to the rescue!

I added a deployment to the Kubernetes cluster with an image set to python:slim-bullseye and simply mounted the script below as /scripts/main.py using a configmap.

from os import environ
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = int(environ.get("PORT", 8080))
ACCOUNT_THUMBPRINT = environ["ACCOUNT_THUMBPRINT"]

class handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        self.end_headers()
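        # The challenge key is the last segment of the request path.
        challenge = self.path.rsplit("/", 1)[-1]
        # Reply with the key and the account thumbprint, joined by a dot.
        message = f"{challenge}.{ACCOUNT_THUMBPRINT}"
        self.wfile.write(message.encode("ascii"))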

with HTTPServer(("", PORT), handler) as server:
    server.serve_forever()

The deployment is set to run python /scripts/main.py. The account thumbprint is a secret key that you get when you register a session with the ACME shell script.
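The response logic of the script above can also be pulled out into a plain function, which makes it easy to check without starting a server. This is only a sketch: the function name key_authorization and the values "abc123" and "THUMB" are placeholders, not real challenge data.

```python
def key_authorization(path: str, thumbprint: str) -> str:
    """Build the reply body: the last path segment, a dot, then the thumbprint."""
    token = path.rsplit("/", 1)[-1]
    return f"{token}.{thumbprint}"


# Placeholder values for illustration only.
print(key_authorization("/.well-known/acme-challenge/abc123", "THUMB"))
# abc123.THUMB
```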

Kind of complicated – but at least now I can issue a TLS certificate for my domain any time using:

$ acme.sh --issue -d example.com --stateless

The setup would be a little smoother if I had published a ready-to-go container image with the script included.

April 13, 2022 04:38 PM UTC


Real Python

Python Virtual Environments: A Primer

In this tutorial, you’ll learn how to work with Python’s venv module to create and manage separate virtual environments for your Python projects. Each environment can use different versions of package dependencies and Python. After you’ve learned to work with virtual environments, you’ll know how to help other programmers reproduce your development setup, and you’ll make sure that your projects never cause dependency conflicts for one another.

By the end of this tutorial, you’ll know how to:

  • Create and activate a Python virtual environment
  • Explain why you want to isolate external dependencies
  • Visualize what Python does when you create a virtual environment
  • Customize your virtual environments using optional arguments to venv
  • Deactivate and remove virtual environments
  • Choose additional tools for managing your Python versions and virtual environments

Virtual environments are a common and effective technique used in Python development. Gaining a better understanding of how they work, why you need them, and what you can do with them will help you master your Python programming workflow.

Free Bonus: Click here to get access to a free 5-day class that shows you how to avoid common dependency management issues with tools like Pip, PyPI, Virtualenv, and requirements files.

Throughout the tutorial, code examples are given for Windows, Ubuntu Linux, and macOS. Windows commands use the PS> prompt, while Linux and macOS commands use the $ prompt. Use the commands for your platform, and feel free to read the others if you want to learn how to work with Python virtual environments on other operating systems.

How Can You Work With a Python Virtual Environment?

If you just need to get a Python virtual environment up and running to continue working on your favorite project, then this section is the right place for you.

The instructions in this tutorial use Python’s venv module to create virtual environments. This module is part of Python’s standard library, and it’s the officially recommended way to create virtual environments since Python 3.5.

Note: There are other great third-party tools for creating virtual environments, such as conda and virtualenv, that you can learn more about later in this tutorial. Any of these tools can help you set up a Python virtual environment.

For basic usage, venv is an excellent choice because it already comes packaged with your Python installation. With that in mind, you’re ready to create your first virtual environment in this tutorial.

Create It

Any time you’re working on a Python project that uses external dependencies that you’re installing with pip, it’s best to first create a virtual environment:

PS> python -m venv venv

If you’re using Python on Windows and you haven’t configured the PATH and PATHEXT variables, then you might need to provide the full path to your Python executable:

PS> C:\Users\Name\AppData\Local\Programs\Python\Python310\python -m venv venv

The system path shown above assumes that you installed Python 3.10 using the Windows installer provided by the Python downloads page. The path to the Python executable on your system might be different. Working with PowerShell, you can find the path using the where.exe python command.

$ python3 -m venv venv

Many Linux operating systems ship with a version of Python 3. If python3 doesn’t work, then you’ll have to first install Python, and you may need to use the specific name of the executable version that you installed, for example python3.10 for Python 3.10.x. If that’s the case for you, remember to replace mentions of python3 in the code blocks with your specific version number.

$ python3 -m venv venv

Older versions of macOS come with a system installation of Python 2.7.x that you should never use to run your scripts. If you’re working on macOS < 12.3 and invoke the Python interpreter with python instead of python3, then you might accidentally start up the outdated system Python interpreter.

If running python3 doesn’t work, then you’ll have to first install a modern version of Python.
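The same venv module can also be called from Python code, which may help to see what the command above does. The sketch below creates an environment in a temporary directory (an assumption made so the example cleans up after itself) and passes with_pip=False to stay fast and offline, whereas the command line installs pip by default.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    # with_pip=False skips bootstrapping pip, unlike `python -m venv venv`.
    venv.create(env_dir, with_pip=False)
    # Every environment created by venv contains a pyvenv.cfg file.
    created = (env_dir / "pyvenv.cfg").exists()
    print(created)
```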

Activate It

Great! Now your project has its own virtual environment. Generally, before you start using it, you’ll first activate the environment by executing a script that comes with the installation:

PS> venv\Scripts\Activate.ps1
(venv) PS>

$ source venv/bin/activate
(venv) $

Before you run this command, make sure that you’re in the folder that contains the virtual environment you just created.

Note: You can also work with your virtual environment without activating it. To do this, you’ll provide the absolute path to its Python interpreter when executing a command. However, most commonly, you’ll want to activate the virtual environment after creating it to save yourself the effort of repeatedly having to type long absolute paths.

Once you can see the name of your virtual environment—in this case (venv)—in your command prompt, then you know that your virtual environment is active. You’re all set and ready to install your external packages!
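Besides looking at the prompt, you can also check from inside Python. A minimal sketch, assuming the environment was created with venv: in that case sys.prefix points at the environment while sys.base_prefix still points at the base installation. The function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


print(in_virtual_environment())
```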

Install Packages Into It

After creating and activating your virtual environment, you can now install any external dependencies that you need for your project:

Read the full article at https://realpython.com/python-virtual-environments-a-primer/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

April 13, 2022 02:00 PM UTC


Stack Abuse

Guide to Dictionaries in Python

Introduction

Python comes with a variety of built-in data structures, capable of storing different types of data. A Python dictionary is one such data structure that can store data in the form of key-value pairs - conceptually similar to a map. The values in a Python dictionary can be accessed using the keys.

In this guide, we will be discussing Python dictionaries in detail. Firstly, we'll cover the basic dictionary operations (creating a dictionary, updating it, removing and adding elements, etc.) and take a look at a couple more interesting methods afterward.

How To Create a Dictionary in Python

To create a Python dictionary, we pass a sequence of items (entries) inside curly braces {} and separate them using a comma (,). Each entry consists of a key and a value, also known as a key-value pair.

Note: The values can belong to any data type and they can repeat, but the keys must remain unique. Additionally, you can't assign multiple values to the same key, though you can assign a list of values (as a single value).

The following examples demonstrate how to create Python dictionaries.

Creating an empty dictionary:

example_dict = {}

Creating a dictionary with integer keys:

example_dict = {1: 'mango', 2: 'pawpaw'}

Creating a dictionary with mixed keys:

example_dict = {'fruit': 'mango', 1: [4, 6, 8]}

Alternatively, we can create a dictionary by explicitly calling Python's dict() constructor:

example_dict = dict({1:'mango', 2:'pawpaw'})

A dictionary can also be created from a sequence as shown below:

example_dict = dict([(1,'mango'), (2,'pawpaw')])

Dictionaries can also be nested, which means that we can create a dictionary inside another dictionary. For example:

example_dict = {1: {'student1' : 'Nicholas', 'student2' : 'John', 'student3' : 'Mercy'},
        2: {'course1' : 'Computer Science', 'course2' : 'Mathematics', 'course3' : 'Accounting'}}
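As a short sketch of how such a nested dictionary might be read, values are reached by chaining one square-bracket lookup per level. The variable names first_student and second_course are just for illustration:

```python
example_dict = {
    1: {'student1': 'Nicholas', 'student2': 'John', 'student3': 'Mercy'},
    2: {'course1': 'Computer Science', 'course2': 'Mathematics', 'course3': 'Accounting'},
}

# The first lookup selects the inner dictionary, the second selects the value.
first_student = example_dict[1]['student1']
second_course = example_dict[2]['course2']

print(first_student)
print(second_course)
# Nicholas
# Mathematics
```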

To print the dictionary contents, we can use Python's print() function and pass the dictionary as the argument:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

print(example_dict)

This results in:

{'Company': 'Toyota', 'model': 'Premio', 'year': 2012}

How To Access Elements of a Python Dictionary

To access dictionary items - we pass the key, using the square bracket notation:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict["model"]

print(x)

This nets us the value associated with the "model" key:

Premio

You can store "configuration" items or common constants into a dictionary for ease of centralized access:

example_dict = {'Name': 'Mercy', 'Age': 23, 'Course': 'Accounting'}

print("Student Name:", example_dict['Name'])
print("Course:", example_dict['Course'])
print("Age:", example_dict['Age'])

This would result in:

Student Name: Mercy
Course: Accounting
Age: 23

The dictionary object also provides the get() method, which can be used to access dictionary elements as well. We call the method on the dictionary using the dot operator and pass the key as the argument:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict.get("model")

print(x)

This results in:

Premio
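One practical difference between the two access styles is how they treat a missing key. The short sketch below uses a key that is not in the dictionary ("color" is just an illustrative choice) to show that get() returns None or a supplied fallback, while square brackets raise a KeyError.

```python
example_dict = {"Company": "Toyota", "model": "Premio", "year": 2012}

# A missing key returns None instead of raising KeyError
missing = example_dict.get("color")

# An optional second argument supplies a fallback value
fallback = example_dict.get("color", "unknown")

# Square brackets raise KeyError for the same missing key
try:
    example_dict["color"]
    raised = False
except KeyError:
    raised = True

print(missing, fallback, raised)
```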

Now we know how to access dictionary elements! In the next section, we'll discuss how to add new elements to an already existing dictionary.

How To Add Elements to a Python Dictionary

There are numerous ways to add new elements to a dictionary. A common way is to add a new key and assign a value to it:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

example_dict["Capacity"] = "1800CC"

print(example_dict)

When a key doesn't exist, and we assign a value to it - it gets added to the dictionary:

{'Company': 'Toyota', 'model': 'Premio', 'year': 2012, 'Capacity': '1800CC'}

The new element has Capacity as the key and 1800CC as its corresponding value. Since Python 3.7, dictionaries preserve insertion order, so it has been added as the last element of the dictionary. Here is another example. First, let's create an empty dictionary:

example_dict = {}

print("An Empty Dictionary: ")
print(example_dict)

Let's verify that it's empty:

An Empty Dictionary:
{}

The empty braces show that the dictionary has nothing stored yet. Let us add some elements to it, one at a time:

example_dict[0] = 'Apples'
example_dict[2] = 'Mangoes'
example_dict[3] = 20

print("\n3 elements have been added: ")
print(example_dict)

This results in:

3 elements have been added:
{0: 'Apples', 2: 'Mangoes', 3: 20}

To add the elements, we specified keys as well as the corresponding values:

example_dict[0] = 'Apples'

In the above example, 0 is the key while Apples is the value. It is even possible to store several values under one key, as long as they are wrapped in a single object such as a tuple, list, or set:

# These three comma-separated values are implicitly packed into a tuple
example_dict['Values'] = 1, "Pairs", 4

print("\n3 elements have been added: ")
print(example_dict)

And we have a key with a tuple as its value (shown here for a dictionary that was empty beforehand):

3 elements have been added:
{'Values': (1, 'Pairs', 4)}

Other than adding new elements to a dictionary, dictionary elements can also be updated/changed, which we'll go over in the next section.

How To Update Dictionary Elements

After adding a value to a dictionary, we can modify it later. Use the key of the element to change the corresponding value:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

example_dict["year"] = 2014

print(example_dict)

In this example, we've updated the value for the key year from the old value of 2012 to a new value of 2014:

{'Company': 'Toyota', 'model': 'Premio', 'year': 2014}

How To Remove Dictionary Elements

The removal of an element from a dictionary can be done in several ways, which we'll discuss one by one in this section.

The del keyword can be used to remove the element with the specified key:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

del example_dict["year"]

print(example_dict)

This results in:

{'Company': 'Toyota', 'model': 'Premio'}

We used the del keyword followed by the dictionary name. Inside the square brackets that follow the dictionary name, we passed the key of the element we need to delete from the dictionary, which in this example was year. The entry for year in the dictionary was then deleted.

Another way to delete a key-value pair is to use the pop() method and pass the key of the entry to be deleted as the argument to the method:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

example_dict.pop("year")

print(example_dict)

We invoked the pop() method by calling it on the dictionary. Running this code will delete the entry for year in the dictionary:

{'Company': 'Toyota', 'model': 'Premio'}

The popitem() method removes the last item inserted into the dictionary, without needing to specify the key. Take a look at the following example:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

example_dict.popitem()

print(example_dict)

The last entry into the dictionary was year. It has been removed after calling the popitem() method:

{'Company': 'Toyota', 'model': 'Premio'}
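Both pop() and popitem() also return what they remove, which the examples above discard. The sketch below captures those return values; the missing key "color" and the variable names are illustrative only.

```python
example_dict = {"Company": "Toyota", "model": "Premio", "year": 2012}

# pop() returns the value that was stored under the key
removed_value = example_dict.pop("year")

# A second argument is returned instead of raising KeyError for a missing key
fallback = example_dict.pop("color", None)

# popitem() returns the removed entry as a (key, value) tuple
last_pair = example_dict.popitem()

print(removed_value, fallback, last_pair)
print(example_dict)
```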

But what if you want to delete the entire dictionary? It would be difficult and cumbersome to use one of these methods on every single key. Instead, you can use the del keyword to delete the entire dictionary:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

del example_dict

print(example_dict)

However, this code will raise an error. The reason is that we are trying to access a dictionary that doesn't exist since it has been deleted beforehand:

NameError: name 'example_dict' is not defined

Depending on the use case, you might need to remove all dictionary elements but not the dictionary itself. This can be achieved by calling the clear() method on the dictionary:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

example_dict.clear()

print(example_dict)

This will give you an empty dictionary (since all the dictionary elements have been removed):

{}

Other Common Dictionary Methods in Python

Besides methods we've covered so far, Python provides us with a lot of other interesting methods helping us perform operations other than the basic ones described before. In the following subsections, we'll take a look at some other methods you can use alongside dictionaries in Python.

len() Function

With this built-in function, you can count the number of elements in a dictionary. For example:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

print(len(example_dict))

There are three entries in the dictionary, hence the function will return 3:

3

copy() Method

This method returns a copy of the existing dictionary. For example:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}
x = example_dict.copy()

print(x)

Let's make sure the copy is properly made and assigned to the variable x:

{'Company': 'Toyota', 'model': 'Premio', 'year': 2012}

After printing x in the console, you see that it contains the same elements as those stored in the example_dict dictionary.

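Note: This is useful because adding, removing, or reassigning keys in the copied dictionary won't affect the original one. The copy is shallow, though, so mutable values such as lists are still shared between the two dictionaries.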
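To make the limits of copy() concrete, here is a minimal sketch with a hypothetical dictionary that holds a list. It shows that reassigning a key only touches the copy, that mutating a shared list touches both, and that copy.deepcopy() from the standard library is one way to get fully independent data.

```python
import copy

# Hypothetical data: one plain value and one mutable list
original = {"model": "Premio", "colors": ["Gray", "White"]}

shallow = original.copy()
shallow["model"] = "Allion"          # rebinding a key affects only the copy
shallow["colors"].append("Black")    # mutating the shared list affects both

deep = copy.deepcopy(original)
deep["colors"].append("Red")         # the deep copy has its own list

print(original)
print(shallow)
print(deep)
```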

items() Method

When called, this method returns a view object containing the dictionary's key-value pairs as tuples. This method is primarily used when you want to iterate through a dictionary.

The method is simply called on the dictionary object name as shown below:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

for k, v in example_dict.items():
  print(k, v)

This will result in:

Company Toyota
model Premio
year 2012

The object returned by items() can also be used to show the changes that have been implemented in the dictionary:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict.items()

print(x)

example_dict["model"] = "Mark X"

print(x)

This code illustrates that when you change a value in the dictionary, the items object is also updated to reflect this change:

dict_items([('Company', 'Toyota'), ('model', 'Premio'), ('year', 2012)])
dict_items([('Company', 'Toyota'), ('model', 'Mark X'), ('year', 2012)])

fromkeys() Method

This method returns a new dictionary with the specified keys, all mapped to the same value. It takes the syntax given below:

dict.fromkeys(keys, value)

The required keys parameter is an iterable that specifies the keys for the new dictionary. The optional value parameter specifies the value assigned to every key; it defaults to None.

Suppose we need to create a dictionary of three keys all with the same value, say 25:

name = ('John', 'Nicholas', 'Mercy')
age = 25

example_dict = dict.fromkeys(name, age)

print(example_dict)

Let's verify that fromkeys() method created the dictionary we've described:

{'John': 25, 'Nicholas': 25, 'Mercy': 25}

As expected, the fromkeys() method was able to pick the keys and combine them with the value 25 to create the dictionary we wanted.

The keys parameter is mandatory. The following example demonstrates what happens when the value parameter is not specified:

name = ('John', 'Nicholas', 'Mercy')

example_dict = dict.fromkeys(name)

print(example_dict)

In this case, None was used as the default value:

{'John': None, 'Nicholas': None, 'Mercy': None}
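One thing worth knowing about fromkeys() is that every key receives the very same value object. The sketch below, which uses illustrative variable names, shows how that matters when the value is mutable and how a dictionary comprehension gives each key its own list instead.

```python
names = ('John', 'Nicholas', 'Mercy')

# Every key points at the same list object
shared = dict.fromkeys(names, [])
shared['John'].append('Math')

# A dictionary comprehension creates a fresh list per key
separate = {name: [] for name in names}
separate['John'].append('Math')

print(shared)
print(separate)
```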

setdefault() Method

This method is applicable when we need to get the value of the element with the specified key. If the key is not found, it will be inserted into the dictionary alongside the specified default value.

The method takes the following syntax:

dictionary.setdefault(keyname, value)

In this method, the keyname parameter is required. It is the key of the item whose value you want returned. The value parameter is optional. If the dictionary already has the key, this parameter won't have any effect. If the key doesn't exist, then the value given in this method will become the value of the key. It has a default value of None:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict.setdefault("color", "Gray")

print(x)

The dictionary doesn't have the key for color. The setdefault() method has inserted this key and the specified value, that is, Gray, has been used as its value:

Gray

The following example demonstrates how the method behaves if the value for the key does exist:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict.setdefault("model", "Allion")

print(x)

The value Allion has no effect on the dictionary since we already have a value for the key model:

Premio
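A common use of setdefault() is grouping items without first checking whether a key exists. The following is a small sketch with made-up student data; the names students and by_course are illustrative.

```python
# Hypothetical (name, course) pairs
students = [
    ("Nicholas", "Computer Science"),
    ("John", "Mathematics"),
    ("Mercy", "Computer Science"),
]

by_course = {}
for name, course in students:
    # Insert an empty list the first time a course is seen, then append to it
    by_course.setdefault(course, []).append(name)

print(by_course)
```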

keys() Method

This method also returns a view object, in this case containing all the keys in the dictionary. And just like with the items() method, the returned object reflects changes made to the dictionary.

To use this method, we only call it on the name of the dictionary, as shown below:

dictionary.keys()

For example:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

x = example_dict.keys()

print(x)

This results in:

dict_keys(['Company', 'model', 'year'])

Often, this method is used to iterate through each key in your dictionary:

example_dict = {
  "Company": "Toyota",
  "model": "Premio",
  "year": 2012
}

for k in example_dict.keys():
  print(k)

This will print each key of the example_dict in a separate line:

Company
model
year
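As a brief aside, iterating over the dictionary itself also yields its keys, and key views support set-style operations. The sketch below illustrates both; the second dictionary, other, is a hypothetical example.

```python
example_dict = {"Company": "Toyota", "model": "Premio", "year": 2012}

# Iterating over the dictionary directly yields its keys
direct = [k for k in example_dict]

# Membership tests check keys, not values
has_model = "model" in example_dict

# Key views behave like sets, so they can be intersected
other = {"year": 2014, "color": "Gray"}
common = example_dict.keys() & other.keys()

print(direct, has_model, common)
```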

Conclusion

This marks the end of this guide on Python dictionaries. These dictionaries store data in key-value pairs. The key acts as the identifier for the item while the value is the value of the item. The Python dictionary comes with a variety of methods that can be applied for the retrieval or manipulation of data. In this article, we saw how a Python dictionary can be created, modified, and deleted along with some of the most commonly used dictionary methods.

April 13, 2022 12:15 PM UTC


David Amos

Revisiting Rock Paper Scissors in Python


When you learn to program for the first time, you look for (or, perhaps, are assigned) projects that reinforce basic concepts. But how often do you, once you've attained more knowledge and experience, revisit those beginner projects from the perspective of an advanced programmer?

In this article I want to do just that. I want to revisit a common beginner project, implementing the game "Rock Paper Scissors" in Python, with the knowledge I've gained from nearly eight years of Python programming experience.

🎓
Are you a beginner? Don't click away! You can still learn a lot from this article.


The Rules Of "Rock Paper Scissors"

Before diving into code, let's set the stage by outlining how "Rock Paper Scissors" is played. Two players each choose one of three items: rock, paper, or scissors. The players reveal their selection to each other simultaneously and the winner is determined by the following rules:

  1. Rock beats scissors
  2. Scissors beats paper
  3. Paper beats rock

Growing up, my friends and I used "Rock Paper Scissors" to solve all sorts of problems. Who gets to play first in a one-player video game? Who gets the last can of soda? Who has to go pick up the mess we just made? Important stuff.

The Requirements

Let's lay out some requirements for the implementation. Rather than building a full-blown game, let's focus on writing a function called play() that accepts two string arguments (the choice of "rock", "paper", or "scissors" selected by each player) and returns a string indicating the winner (e.g., "paper wins") or if the game results in a tie (e.g., "tie").

Here are some examples of how play() is called and what it returns:

>>> play("rock", "paper")
'paper wins'

>>> play("scissors", "paper")
'scissors wins'

>>> play("paper", "paper")
'tie'

If one or both of the two arguments are invalid, meaning they aren't one of "rock", "paper", or "scissors", then play() should raise some kind of exception.

play() should also be commutative. That is, play("rock", "paper") should return the same thing as play("paper", "rock").

The "Beginner" Solution

To set a baseline for comparison, consider how a beginner might implement the play() function. If this beginner is anything like I was when I first learned to program, they'd probably start writing down a whole bunch of if statements:

def play(player1_choice, player2_choice):
    if player1_choice == "rock":
        if player2_choice == "rock":
            return "tie"
        elif player2_choice == "paper":
            return "paper wins"
        elif player2_choice == "scissors":
            return "rock wins"
        else:
            raise ValueError(f"Invalid choice: {player2_choice}")
    elif player1_choice == "paper":
        if player2_choice == "rock":
            return "paper wins"
        elif player2_choice == "paper":
            return "tie"
        elif player2_choice == "scissors":
            return "scissors wins"
        else:
            raise ValueError(f"Invalid choice: {player2_choice}")
    elif player1_choice == "scissors":
        if player2_choice == "rock":
            return "rock wins"
        elif player2_choice == "paper":
            return "scissors wins"
        elif player2_choice == "scissors":
            return "tie"
        else:
            raise ValueError(f"Invalid choice: {player2_choice}")
    else:
        raise ValueError(f"Invalid choice: {player1_choice}")

Strictly speaking, there's nothing wrong with this code. It runs without error and meets all of the requirements. It's also similar to a number of high-ranking implementations for the Google search "rock paper scissors python."

Experienced programmers will quickly recognize a number of code smells, though. In particular, the code is repetitive and there are many possible execution paths.

Advanced Solution #1

One way to implement "Rock Paper Scissors" from a more advanced perspective involves leveraging Python's dictionary type. A dictionary can map items to those that they beat according to the rules of the game.

Let's call this dictionary loses_to (naming is hard, y'all):

loses_to = {
    "rock": "scissors",
    "paper": "rock",
    "scissors": "paper",
}

loses_to provides a simple API for determining which item loses to another:

>>> loses_to["rock"]
'scissors'

>>> loses_to["scissors"]
'paper'

A dictionary has a couple of benefits. You can use it to:

  1. Validate chosen items by checking for membership or raising a KeyError
  2. Determine a winner by checking if a value loses to the corresponding key

With this in mind, the play() function could be written as follows:

def play(player1_choice, player2_choice):
    if player2_choice == loses_to[player1_choice]:
        return f"{player1_choice} wins"
    if player1_choice == loses_to[player2_choice]:
        return f"{player2_choice} wins"
    if player1_choice == player2_choice:
        return "tie"

In this version, play() takes advantage of the built-in KeyError raised by the loses_to dictionary when trying to access an invalid key. This effectively validates the players' choices. So if either player chooses an invalid item (something like "lizard" or 1234), play() raises a KeyError:

>>> play("lizard", "paper")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in play
KeyError: 'lizard'

Although the KeyError isn't as helpful as a ValueError with a descriptive message, it still gets the job done.
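If a more descriptive error is preferred, one option is to validate both choices up front. The following is only a sketch of that idea, not part of the solution described here; it assumes the choices are hashable values, and the error message wording is an assumption.

```python
loses_to = {
    "rock": "scissors",
    "paper": "rock",
    "scissors": "paper",
}

def play(player1_choice, player2_choice):
    # Validate both choices first so the error names the offending value
    for choice in (player1_choice, player2_choice):
        if choice not in loses_to:
            raise ValueError(f"Invalid choice: {choice!r}")
    if player2_choice == loses_to[player1_choice]:
        return f"{player1_choice} wins"
    if player1_choice == loses_to[player2_choice]:
        return f"{player2_choice} wins"
    if player1_choice == player2_choice:
        return "tie"

print(play("rock", "paper"))
```

The cost is three extra lines, and the membership check duplicates work that the dictionary lookup would otherwise do implicitly.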

The new play() function is much simpler than the original one. Instead of handling a bunch of explicit cases, there are only three cases to check:

  1. player2_choice loses to player1_choice
  2. player1_choice loses to player2_choice
  3. player1_choice and player2_choice are the same

There's a fourth hidden case, however, that you almost have to squint to see. That case occurs when none of the other three cases are true, in which case play() returns a None value.

But... can this case ever really occur? Actually, no. It can't. According to the rules of the game, if player 1 doesn't lose to player 2 and player 2 doesn't lose to player 1, then both players must have chosen the same item.

In other words, we can remove the final if block from play() and just return "tie" if neither of the other two if blocks execute:

def play(player1_choice, player2_choice):
    if player2_choice == loses_to[player1_choice]:
        return f"{player1_choice} wins"
    if player1_choice == loses_to[player2_choice]:
        return f"{player2_choice} wins"
    return "tie"
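Because there are only nine possible input pairs, the reasoning above can be checked exhaustively. The sketch below is one way to do that with itertools.product; the names results, commutative, and ties_only_on_equal are illustrative.

```python
from itertools import product

loses_to = {
    "rock": "scissors",
    "paper": "rock",
    "scissors": "paper",
}

def play(player1_choice, player2_choice):
    if player2_choice == loses_to[player1_choice]:
        return f"{player1_choice} wins"
    if player1_choice == loses_to[player2_choice]:
        return f"{player2_choice} wins"
    return "tie"

# Evaluate every ordered pair of valid choices
choices = list(loses_to)
results = {(a, b): play(a, b) for a, b in product(choices, repeat=2)}

# Swapping the arguments should never change the result
commutative = all(results[a, b] == results[b, a] for a, b in results)

# "tie" should be returned exactly when both choices are equal
ties_only_on_equal = all((r == "tie") == (a == b) for (a, b), r in results.items())

print(commutative, ties_only_on_equal)
```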

We've made a tradeoff. We've sacrificed clarity (I'd argue that there's a greater cognitive load required to understand how the above play() function works compared to the "beginner" version) in order to shorten the function and avoid an unreachable state.

Was this tradeoff worth it? I don't know. Does purity beat practicality?

Advanced Solution #2

The previous solution works great. It's readable and much shorter than the "beginner" solution. But it's not very flexible. That is, it can't handle variations of "Rock Paper Scissors" without rewriting some of the logic.

For instance, there's a variation called "Rock Paper Scissors Lizard Spock" with a more complex set of rules:

  1. Rock beats scissors and lizard
  2. Paper beats rock and Spock
  3. Scissors beats paper and lizard
  4. Lizard beats Spock and paper
  5. Spock beats scissors and rock

How can you adapt the code to handle this variation?

First, replace the string values in the loses_to dictionary with Python sets. Each set contains all of the items that lose to the corresponding key. Here's what this version of loses_to looks like using the original "Rock Paper Scissors" rules:

loses_to = {
    "rock": {"scissors"},
    "paper": {"rock"},
    "scissors": {"paper"},
}

Why sets? Because we only care about what items lose to a given key. We don't care about the order of those items.

To adapt play() to handle the new loses_to dictionary, all you have to do is replace == with in to use a membership check instead of an equality check:

def play(player1_choice, player2_choice):
    #                 vv--- replace == with in
    if player2_choice in loses_to[player1_choice]:
        return f"{player1_choice} wins"
    #                 vv--- replace == with in
    if player1_choice in loses_to[player2_choice]:
        return f"{player2_choice} wins"
    return "tie"

Take a moment to run this code and verify that everything still works.

Now replace loses_to with a dictionary implementing the rules for "Rock Paper Scissors Lizard Spock." Here's what that looks like:

loses_to = {
    "rock": {"scissors", "lizard"},
    "paper": {"rock", "spock"},
    "scissors": {"paper", "lizard"},
    "lizard": {"spock", "paper"},
    "spock": {"scissors", "rock"},
}

The new play() function works with these new rules flawlessly:

>>> play("rock", "paper")
'paper wins'

>>> play("spock", "lizard")
'lizard wins'

>>> play("spock", "spock")
'tie'

In my opinion, this is a great example of the power of picking the right data structure. By using sets to represent all of the items that lose to a key in the loses_to dictionary and replacing == with in, you've made a more general solution without having to add a single line of code.
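With the rules now living entirely in data, a typo in the table becomes the main way to break the game. As a hedged sketch, a small helper (is_consistent is a hypothetical name, not something from the original solution) can check that every pair of distinct items has exactly one winner and that no item beats itself:

```python
from itertools import combinations

loses_to = {
    "rock": {"scissors", "lizard"},
    "paper": {"rock", "spock"},
    "scissors": {"paper", "lizard"},
    "lizard": {"spock", "paper"},
    "spock": {"scissors", "rock"},
}

def is_consistent(rules):
    """Return True if each pair of distinct items has exactly one winner."""
    for a, b in combinations(rules, 2):
        a_beats_b = b in rules[a]
        b_beats_a = a in rules[b]
        # Both True (a cycle of two) or both False (no winner) is an error
        if a_beats_b == b_beats_a:
            return False
    # No item should appear in its own set of losers
    return all(item not in beaten for item, beaten in rules.items())

print(is_consistent(loses_to))
```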

Advanced Solution #3

Let's step back and take a slightly different approach. Instead of looking up items in a dictionary to determine the winner, we'll build a table of all possible inputs and their outcomes.

You still need something to represent the rules of the game, so let's start with the loses_to dict from the previous solution:

loses_to = {
    "rock": {"scissors"},
    "paper": {"rock"},
    "scissors": {"paper"},
}

Next, write a function build_results_table() that takes a rules dictionary, like loses_to, and returns a new dictionary that maps states to their results. For instance, here's what build_results_table() should return when called with loses_to as its argument:

>>> build_results_table(loses_to)
{
    {"rock", "scissors"}: "rock wins",
    {"paper", "rock"}: "paper wins",
    {"scissors", "paper"}: "scissors wins",
    {"rock", "rock"}: "tie",
    {"paper", "paper"}: "tie",
    {"scissors", "scissors"}: "tie",
}

If you think something looks off there, you're right. There are two things wrong with this dictionary:

  1. Sets like {"rock", "rock"} can't exist. Sets can't have repeated elements. In a real scenario, this set would look like {"rock"}. You don't actually need to worry about this too much. I wrote those sets with two elements to make it clear what those states represent.
  2. You can't use sets as dictionary keys. But we want to use sets because they take care of commutativity for us automatically. That is, {"rock", "paper"} and {"paper", "rock"} evaluate equal to each other and should therefore return the same result upon lookup.

The way to get around this is to use Python's built-in frozenset type. Like sets, frozensets support membership checks, and they compare equal to another set or frozenset if and only if both sets have the same members. Unlike standard sets, however, frozenset instances are immutable. As a result, they can be used as dictionary keys.

To implement build_results_table() you could loop over each of the keys in the loses_to dictionary and build a frozenset instance for each of the string values in the set corresponding to the key:

def build_results_table(rules):
     results = {}
     for key, values in rules.items():
         for value in values:
             state = frozenset((key, value))
             result = f"{key} wins"
             results[state] = result
     return results

This gets you about halfway there:

>>> build_results_table(loses_to)
{frozenset({'rock', 'scissors'}): 'rock wins',
 frozenset({'paper', 'rock'}): 'paper wins',
 frozenset({'paper', 'scissors'}): 'scissors wins'}

The states that result in a tie aren't covered, though. To add those, you need to create frozenset instances for each key in the rules dictionary that map to the string "tie":

def build_results_table(rules):
     results = {}
     for key, values in rules.items():
         # Add the tie states
         results[frozenset((key,))] = "tie"  # <-- New
         # Add the winning states
         for value in values:
             state = frozenset((key, value))
             result = f"{key} wins"
             results[state] = result
     return results

Now the value returned by build_results_table() looks right:

>>> build_results_table(loses_to)
{frozenset({'rock'}): 'tie',
 frozenset({'rock', 'scissors'}): 'rock wins',
 frozenset({'paper'}): 'tie',
 frozenset({'paper', 'rock'}): 'paper wins',
 frozenset({'scissors'}): 'tie',
 frozenset({'paper', 'scissors'}): 'scissors wins'}

Why go through all this trouble? After all, build_results_table() looks more complicated than the play() function from the previous solution.

You're not wrong, but I want to point out that this pattern can be quite useful. If there are a finite number of states that can exist in a program, you can sometimes see dramatic boosts in speed by precalculating the results for all of those states. This might be overkill for something as simple as "Rock Paper Scissors," but could make a huge difference in situations where there are hundreds of thousands or even millions of states.

One real-world scenario where this type of approach makes sense is the Q-learning algorithm used in reinforcement learning applications. In that algorithm, a table of states, the Q-table, is maintained that maps each state to estimated values (Q-values) for some pre-determined actions. Once an agent is trained, it can choose an action based on the values for an observed state and then act accordingly.

Often, a table like the one generated by build_results_table() is computed and then stored in a file. When the program runs, the pre-computed table gets loaded into memory and then used by the application.
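As a rough sketch of that store-and-load step, the table can be written to disk with the standard library's pickle module, which handles frozenset keys (JSON does not). The file name outcomes.pickle is a hypothetical choice, and a temporary directory is used here only to keep the example self-contained. Pickle files should only be loaded from trusted sources.

```python
import pickle
import tempfile
from pathlib import Path

loses_to = {
    "rock": {"scissors"},
    "paper": {"rock"},
    "scissors": {"paper"},
}

def build_results_table(rules):
    results = {}
    for key, values in rules.items():
        results[frozenset((key,))] = "tie"
        for value in values:
            results[frozenset((key, value))] = f"{key} wins"
    return results

table = build_results_table(loses_to)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "outcomes.pickle"  # hypothetical file name
    # Precompute once and store the table on disk
    path.write_bytes(pickle.dumps(table))
    # Later, load the stored table instead of rebuilding it
    loaded = pickle.loads(path.read_bytes())

print(loaded == table)
```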

So, now that you have a function that can build a results table, assign the table for loses_to to an outcomes variable:

outcomes = build_results_table(loses_to)

Now you can write a play() function that looks up the state in the outcomes table based on the arguments passed to play and then returns the result:

def play(player1_choice, player2_choice):
    state = frozenset((player1_choice, player2_choice))
    return outcomes[state]

This version of play() is incredibly simple. Just two lines of code! You could even write it as a single line if you wanted to:

def play(player1_choice, player2_choice):
    return outcomes[frozenset((player1_choice, player2_choice))]

Personally, I prefer the two-line version over the single-line version.

Your new play() function follows the rules of the game and is commutative:

>>> play("rock", "paper")
'paper wins'

>>> play("paper", "rock")
'paper wins'

play() even raises a KeyError if it gets called with an invalid choice, but the error is less helpful now that the keys of the outcomes dictionary are sets:

>>> play("lizard", "paper")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 21, in play
    return outcomes[state]
KeyError: frozenset({'lizard', 'paper'})

The vague error would likely not be an issue, however. In this article, you're only implementing the play() function. In a true implementation of "Rock Paper Scissors" you'd most likely capture user input and validate that before ever passing the user's choice to play().

So, how much faster is this implementation versus the previous ones? Here are some timing results comparing the performance of the various implementations using IPython's %timeit magic function. play1() is the version of play() from the Advanced Solution #2 section, and play2() is the current version:

In [1]: %timeit play1("rock", "paper")
141 ns ± 0.0828 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [2]: %timeit play2("rock", "paper")
188 ns ± 0.0944 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In this case, the solution using the results table is actually slower than the previous implementation. The culprit here is the line that converts the function arguments to a frozenset. So, although dictionary lookups are fast, and building a table that maps states to outcomes can potentially improve performance, you need to be careful to avoid expensive operations that may end up negating whatever gains you expect to get.
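One possible way to sidestep the frozenset construction, offered here only as a sketch and not as something measured in this article, is to key the table on plain tuples and store both argument orders. The names build_tuple_table and play3 are hypothetical, and whether this is actually faster on a given machine is something to measure rather than assume.

```python
import timeit

loses_to = {
    "rock": {"scissors"},
    "paper": {"rock"},
    "scissors": {"paper"},
}

def build_tuple_table(rules):
    results = {}
    for key, values in rules.items():
        results[(key, key)] = "tie"
        for value in values:
            # Store both argument orders so lookups stay commutative
            results[(key, value)] = f"{key} wins"
            results[(value, key)] = f"{key} wins"
    return results

outcomes = build_tuple_table(loses_to)

def play3(player1_choice, player2_choice):
    # The tuple key is built directly from the arguments
    return outcomes[player1_choice, player2_choice]

# Time the lookup with the standard library instead of IPython's %timeit
elapsed = timeit.timeit(lambda: play3("rock", "paper"), number=100_000)
print(f"{elapsed:.4f} s for 100,000 calls")
```

The table grows from six entries to nine, which trades a little memory for skipping the set construction on every call.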

Conclusion

I wrote this article as an exercise. I was curious to know how I'd approach a beginner project like "Rock Paper Scissors" in Python now that I have a lot of experience. I hope you found it interesting. If you have any inkling of inspiration now to revisit some of your own beginner projects, then I think I've done my job!

If you do revisit some of your own beginner projects, or if you've done so in the past, let me know how it went in the comments. Did you learn anything new? How different is your new solution to the one you wrote as a beginner?

What Inspired This Article?

An acquaintance from the Julia world, Miguel Raz Guzmán Macedo, turned me on to a blog post by Mosè Giordano. Mosè leverages Julia's multiple dispatch paradigm to write "Rock Paper Scissors" in less than ten lines of code:

Rock–paper–scissors game in less than 10 lines of code
Rock–paper–scissors is a popular hand game. However, some nerds may prefer playing this game on their computer rather than actually shaking their hands. Image credit: Enzoklop, Wikimedia Commons, CC-BY-SA 3.0 We can write this game in less than 10 lines of code in the Julia programming language. Thi…

I won't get into the details of how Mosè's code works. Python doesn't even support multiple dispatch out-of-the-box. (Although you can use it with some help from the plum package.)

Mosè's article got my mental gears spinning and encouraged me to revisit "Rock Paper Scissors" in Python to think about how I could approach the project differently.

As I was working through the solution, however, I was reminded of an article I reviewed for Real Python quite some time ago:

Make Your First Python Game: Rock, Paper, Scissors! – Real Python
In this tutorial, you’ll learn to program rock paper scissors in Python from scratch. You’ll learn how to take in user input, make the computer choose a random action, determine a winner, and split your code into functions.

It turns out the first two solutions I "invented" here are similar to the solution that Chris Wilkerson, the author of Real Python's article, came up with.

Chris's solution is more full-featured. It includes an interactive gameplay mechanism and even uses Python's Enum type to represent game items. That must have also been where I first heard of "Rock Paper Scissors Lizard Spock."



April 13, 2022 01:31 AM UTC


Brian Okken

Current Git CLI workflow

Workflow

Most of my interactions with the git CLI, especially for quick changes, are:

$ git checkout main
$ git pull
$ git checkout -b okken_something
< code changes >
$ git commit -a -m 'quick message'
$ git push

Then the code review and merge happen on the server.

Commands

Let’s break that down.

git checkout main
Start at the main branch.

git pull
Grab any changes from remote repo.

April 13, 2022 12:00 AM UTC

April 12, 2022


PyCoder’s Weekly

Issue #520 (April 12, 2022)

#520 – APRIL 12, 2022



The Oregon Trail in Python

In the 1971 text-based game, the player guides a party of settlers along the Oregon Trail. Random events occur and death abounds. Learn how to write this adventure game in Python.
KEITH FOSTER

Build a Site Connectivity Checker in Python

In this step-by-step project, you’ll build a Python site connectivity checker for the command line. While building this app, you’ll integrate knowledge related to making HTTP requests with standard-library tools, creating command-line interfaces, and managing concurrency with asyncio and aiohttp.
REAL PYTHON

Pinpoint Python Errors in Seconds With Datadog Application Performance Monitoring

Datadog’s Application Performance Management generates detailed flame graphs to help you identify bottlenecks and latency in your Python code. Navigate seamlessly between Python app traces, logs and metrics to troubleshoot and resolve application performance issues fast. Try Datadog APM free →
DATADOG sponsor

Python f-Strings Are More Powerful Than You Might Think

Learn about the lesser-known features of Python’s f-strings, including date formatting, variable debugging, nested f-strings, and conditional formatting.
MARTIN HEINZ • Shared by Martin Heinz

PSF Welcomes New Executive Director Deb Nicholson

PYTHON SOFTWARE FOUNDATION

11th Annual PyLadies Auction

PYCON.BLOGSPOT.COM

Cython Is 20!

STEFAN BEHNEL

Python Release Python 3.11.0a7 (Last Alpha)

PYTHON.ORG

Discussions

Projects for a Self-Taught Dev to Help Get a Job?

REDDIT

Which Modules Have You Used to Automate Work?

TWITTER.COM/DRISCOLLIS

Python Jobs

Python Technical Architect (USA)

Blenderbox

Academic Innovation Software Dev (Ann Arbor, MI, USA)

University of Michigan

Software Development Lead (Ann Arbor, MI, USA)

University of Michigan

Senior Platform Engineer (USA)

Parade

Senior Backend Software Engineer (USA)

Parade

Senior Backend Software Engineer (USA)

Clay

Advanced Python Engineer (Newport Beach, CA, USA)

Research Affiliates

Lead Software Engineer (Anywhere)

Right Side Up

Data Engineer (Chicago, IL, USA)

Aquatic Capital Management

More Python Jobs >>>

Articles & Tutorials

YAML: The Missing Battery in Python

In this tutorial, you’ll learn all about working with YAML in Python. By the end of it, you’ll know about the available libraries, their strengths and weaknesses, and the advanced and potentially dangerous features of YAML. You’ll also serialize Python objects and create a YAML syntax highlighter.
REAL PYTHON

Creating Better Error Messages for Python 3.10 & 3.11

What goes into creating those enhanced error messages in the latest versions of Python? How does the new PEG parser help to pinpoint where errors have occurred? This week on the show, Pablo Galindo Salgado talks about the work that goes into creating these improvements.
REAL PYTHON podcast

CData Software — The Easiest Way to Connect Python With Data

Connect, Integrate, & Automate your data from any other application or tool in real-time, on-premise or cloud, with simple data access to more than 250 cloud applications and data sources. Learn more at cdata.com →
CDATA SOFTWARE sponsor

Python REST APIs With FastAPI

In this course, you’ll learn the main concepts of FastAPI and how to use it to quickly create web APIs that implement best practices by default. By the end of it, you will be able to start creating production-ready web APIs.
REAL PYTHON course

10 Patterns for Writing Cleaner Python

Cleaner code is more focused, easier to read, easier to debug, and generally easier to maintain. This guide covers ten different patterns Python programmers should apply in their code.
ALEX OMEYER

Exploring the World of Declarative Programming

The Prolog programming language can be used as a Python library through pyswip. This article introduces you to declarative coding and how to embed it in your Python code.
PAMPELMUSE

React in Python With Pyodide

Pyodide is a WebAssembly implementation of Python. This article talks about how to use it to write React hooks in Python instead of JavaScript.
XING HAN LU

Take Your GitHub Repository to the Next Level

How to spice up your GitHub repo pages, making them more discoverable, readable, and more useful to the coding community.
ELUDA

Discovering Basic Blocks in Python Bytecode

Examine Python bytecode by diving deep into reading it and constructing a control-flow graph programmatically.
MAX BERNSTEIN

Level Up Your Analytics

Work with your MongoDB data via Apache Arrow, numPy, and Pandas. Seamless, effective analysis with one line of code. Try it today and use Code MKT-PYCODER to redeem $25 of free MongoDB Atlas credits!
MONGODB sponsor

Running Minor Tasks With a Simple Job System in Django

Learn how to write a small job processing mechanism to do background work in your Django project.
MANOS PITSIDIANAKIS

Gotchas of Early-Bound Function Argument Defaults in Python

Exploring how mutable default arguments in Python can cause some surprising behaviors.
REDOWAN DELOWAR • Shared by Redowan Delowar

Projects & Code

vedo: Easy-to-Use 3D Graphics in Python

GITHUB.COM/MARCOMUSY • Shared by Tommy Vandermolen

CheekyKeys: Hands-Free Coding With Facial Gestures

GITHUB.COM/EVERYTHINGISHACKED

direnv: Shell Extension That Loads .env per Directory

GITHUB.COM/DIRENV

rembg: Tool to Remove an Image’s Background

GITHUB.COM/DANIELGATIS

samila: Generative Art Generator

GITHUB.COM/SEPANDHAGHIGHI

Events

PyCon DE & PyData Berlin 2022

April 11 to April 14, 2022
PYCON.DE

Santa Cruz Python Meetup

April 13, 2022
MEETUP.COM

Weekly Real Python Office Hours Q&A (Virtual)

April 13, 2022
REALPYTHON.COM

Heidelberg Python Meetup

April 13, 2022
MEETUP.COM

PyCamp Spain 2022

April 15 to April 19, 2022
PYCAMP.ES


Happy Pythoning!
This was PyCoder’s Weekly Issue #520.
[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

April 12, 2022 07:30 PM UTC


Python Morsels

Python f-string tips & cheat sheets

Python's string formatting syntax is both powerful and complex. Let's break it down and then look at some cheat sheets.

Table of contents

  1. What are we talking about?
  2. Definitions
  3. Example f-strings
  4. Formatting numbers
  5. Formatting strings
  6. Formatting datetime objects
  7. Forcing a programmer-readable representation
  8. Self-documenting expressions & debugging
  9. Cheat sheets
  10. Summary

What are we talking about?

Python's string formatting syntax allows us to inject objects (often other strings) into our strings.

>>> name = "Trey"
>>> print(f"My name is {name}. What's your name?")
My name is Trey. What's your name?

We can even embed expressions:

>>> name = "Trey"
>>> print(f"{name}, which starts with {name[0]}")
Trey, which starts with T

But Python's string formatting syntax also allows us to control the formatting of each of these string components.
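As a rough illustration of what "controlling the formatting" can look like, here is a minimal sketch. The variable names are made up for this example, and the `{name=}` form assumes Python 3.8 or later.

```python
name = "Trey"
price = 1234.5678  # hypothetical value, used only for illustration

# A format specification follows the colon inside the replacement field.
formatted_price = f"{price:,.2f}"  # thousands separator, two decimal places
padded_name = f"{name:>8}"         # right-align within a width of 8 characters
labelled = f"{name=}"              # self-documenting expression (Python 3.8+)

print(formatted_price)
print(padded_name)
print(labelled)
```

Everything after the colon is handed to the object's own formatting logic, which is why numbers and strings accept different specifications.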

There is a lot of complexity in Python's string formatting syntax. If you're just looking for quick answers, skip to the cheat sheets section.

Definitions

Let's start with some definitions. …

Read the full article: https://www.pythonmorsels.com/string-formatting/

April 12, 2022 04:45 PM UTC


Andre Roberge

Natural syntax for units in Python

In the past week, there has been an interesting discussion on Python-ideas about Natural support for units in Python. As I have taught introductory courses in Physics for about 20 of the 30 years of my academic career, I am used to stressing the importance of using units correctly, but had never had the need to explore what kind of support for units was available in Python. I must admit to having been pleasantly surprised by many existing libraries.

In this blog post, I will give a very brief overview of parts of the discussion that took place, and is still taking place, on Python-ideas about this topic. I will then give a very brief introduction to two existing libraries that provide support for units, before showing some actual code inspired by the Python-ideas discussion.

But first, putting my Physics teacher hat on, let me show you some partial Python code that I find extremely satisfying, and which contains a line that is almost guaranteed to horrify programmers everywhere, as it seemingly reuses the variable "m" with a completely different meaning.

>>> g = 9.8[m/s^2]
>>> m = 80[kg]
>>> weight = m * g
>>> weight
<Quantity(784.0, 'kilogram * meter / second ** 2')>
>>> tolerance = 1.e-12[N]
>>> abs(weight - 784[N]) < tolerance
True

Discussion on Python-ideas

The discussion on Python-ideas essentially started with the suggestion that "it would be nice if Python's syntax supported units".  That is, if you could basically do something like:

length = 1m + 3cm
# or even
length = 1m 3cm

and it just worked as "expected". Currently, identifiers in Python cannot start with a number, and writing "3cm" is a SyntaxError. So, in theory, one could add support for this type of construct without causing any backward incompatibility.
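The claim that "3cm" is currently rejected can be checked without running the code, by asking Python to compile it. The helper name below is hypothetical and only meant as a quick sketch.

```python
def is_valid_syntax(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        # Compiling only parses the text; the names m and cm are never looked up.
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(is_valid_syntax("1m + 3cm"))    # proposed unit syntax
print(is_valid_syntax("1*m + 3*cm"))  # explicit multiplication
```

Because the first form fails at compile time, no existing working program can contain it, which is the basis of the backward-compatibility argument.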

While I never thought of it before, as I use Python as a hobby, I consider correct handling of units to be an absolute requirement for any scientific calculation. Much emphasis is being placed on adding type information to ensure correctness: to my mind, adding *unit* information to ensure correctness is even more important than adding type information.

During the course of the discussion on Python-ideas, other possible suggestions were made, some of which are actually supported by at least a couple of existing Python libraries. These suggestions included constructs like the following:

length = 1*m + 3*cm
speed = 4*m / 1*s # or speed = 4 * m / s

length = m(1) + cm(3)
speed = m_s(4)

length = 1_m + 3_cm
speed = 4_m_s

length = 1[m] + 3[cm]
speed = 4[m/s]

length = 1"m" + 3"cm"
speed = 4"m/s"

density = 1.0[kg/m**3]
density = 1.0[kg/m3]
# No one suggested something like the following
density = 1.0[kg/m^3]
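Of the constructs listed above, the first (explicit multiplication) already works in today's Python, because a unit can be an ordinary object that defines arithmetic operators. The toy class below is a hypothetical sketch written for this illustration. It is not how astropy or pint are implemented, and it only handles lengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Toy length quantity stored in metres (illustration only)."""

    metres: float

    def __rmul__(self, number):
        # Called for `3 * cm`, since int does not know how to multiply a Length.
        return Length(number * self.metres)

    def __add__(self, other):
        # Both operands are already expressed in metres, so they can be summed.
        return Length(self.metres + other.metres)


m = Length(1.0)
cm = Length(0.01)

length = 1 * m + 3 * cm
print(length)
```

A real library also has to track dimensions, so that adding a length to a mass is rejected. This sketch skips that entirely.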

I will come back to looking at potential new syntax for units, as it is currently my main interest in this topic. But first, I want to highlight one other main point of the discussion on Python-ideas, namely: Should the units be defined globally for an entire application, or locally according to the standard Python scopes?

My first thought was "of course, it should follow Python's normal scopes". 

Thinking of the opposite argument, what happens if one uses units other than S.I. units in different modules, including those from external libraries?  Take for example "mile", and have a look at its Wikipedia entry. If one uses units with the same name but different values in different parts of an application, any pretense of using quantities with units to ensure accuracy goes out the window. Furthermore, many units libraries make it possible for users to define their own custom units. What happens if the same name is used for different custom units in different modules, and variables, or functions using variables with units, defined in one module are used in a second module?

Still, as long as libraries do not, or cannot change unit definitions globally, and if they provide clear and well-documented access to the units they use, then the normal Python scopes would likely be the best choice.

[For a detailed discussion of these two points of view, have a look at the thread on Python-ideas mentioned above. There doesn't seem to be a consensus as to what the correct approach should be.]

A brief look at two unit libraries

There are many unit libraries available on PyPI. After a brief look at many of them, I decided to focus on only two: astropy.units and pint. These seemed to be the most complete ones currently available, with source code and good supporting documentation available.

I will first look at an example that shows how equivalent descriptions of units are easily handled in both of them. First, I use the units module from astropy:

>>> from astropy import units as u
>>> p1 = 1 * u.N / u.m**2
>>> p1
<Quantity 1. N / m2>
>>> p2 = 1 * u.Pa
>>> p1 == p2
True

Next, doing the same with pint.

>>> import pint
>>> u = pint.UnitRegistry()
>>> p1 = 1 * u.N / u.m**2
>>> p1
<Quantity(1.0, 'newton / meter ** 2')>
>>> p2 = 1 * u.Pa
>>> p1 == p2
True

In astropy, all the units are defined in a single module.  Instead of prefacing the units with the name of the module, one can import units directly:

>>> from astropy.units import m, N, Pa
>>> p1 = 1 * N / m**2
>>> p2 = 1 * Pa
>>> p1 == p2
True

The same cannot be done with pint.

A custom syntax for units

As I was reading posts from the discussion on Python-ideas, I was thinking that it might be fun to come up with a way to "play" with some code written in a more user-friendly syntax for units. After reading the following, written by Matt del Valle, I decided that I should definitely do it.

My personal preference for adding units to python would be to make instances of all numeric classes subscriptable, with the implementation being roughly equivalent to:

def __getitem__(self, unit_cls: type[T]) -> T:
    return unit_cls(self)

We could then discuss the possibility of adding some implementation of units to the stdlib. For example:

from units.si import km, m, N, Pa

3[km] + 4[m] == 3004[m] # True
5[N]/1[m**2] == 5[Pa] # True
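Built-in numbers are not subscriptable today, so the quoted idea cannot be tried directly. A rough way to experiment with it is to put the same `__getitem__` on a subclass. Both class names below are hypothetical and exist only for this sketch.

```python
class Metres(float):
    """Hypothetical unit class: wraps a number as a length in metres."""

    def __repr__(self):
        return f"Metres({float(self)!r})"


class SubscriptableNumber(float):
    """A float that accepts a unit class in square brackets."""

    def __getitem__(self, unit_cls):
        # Roughly the implementation quoted above: hand the number to the unit class.
        return unit_cls(self)


distance = SubscriptableNumber(3)[Metres]
print(distance)
```

The quoted proposal would make this work for plain literals such as 3[Metres], which this sketch cannot do without a change to the language.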

My first thought was to create a custom package building from and depending on astropy.units, as I had looked at it before looking at pint and found it to have everything one might need.  However, as I read its rather unusual license, I decided that I should take another approach: I chose to simply add a new example to my ideas library, making it versatile enough so that it could be used with any unit library that uses the standard Python notation for multiplication, division and power of units, which both pint and astropy do. Note that my ideas library has been created to facilitate quick experiments and is not meant to be used in production code.

First, here's an example that mimics the example given by Matt del Valle above, with what I think is an even nicer (more compact) notation.

python -m ideas -t easy_units

Ideas Console version 0.0.29. [Python version: 3.9.10]

>>> from astropy.units import km, m, N, Pa
>>> 3[km] + 4[m] == 3004[m]
True
>>> 5[N/m^2] == 5[Pa]
True

In addition to allowing '**' for powers of units (not shown above), I chose to also recognize as equivalent the symbol '^' which is more often associated with exponentiation outside of the (Python) programming world.

Let's do essentially the same example using pint instead, and follow it with a few additional lines to illustrate further.

Ideas Console version 0.0.29. [Python version: 3.9.10]

>>> import pint
>>> unit = pint.UnitRegistry()
>>> 3[km] + 4[m] == 3004[m]
True
>>> 5[N/m^2] == 5[Pa]
True
>>> pressure = 5[N/m^2]
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>
>>> pressure = 5[N/m*m]
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>

In the last example, I made sure that "N/m*m" did not follow the regular left-to-right order of operations, which might have resulted in unit cancellation as we first divide and then multiply by meters.
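For comparison, this is what the regular left-to-right rule does with plain numbers in ordinary Python. The values are arbitrary and chosen only to make the difference visible.

```python
# Division and multiplication have equal precedence and group left to right.
left_to_right = 8 / 2 * 2  # evaluated as (8 / 2) * 2
grouped = 8 / (2 * 2)      # everything after the slash treated as the denominator

print(left_to_right)
print(grouped)
```

With units in place of numbers, the first reading would cancel the meters, while the second gives newtons per square meter.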

A look at some details

Using ideas with a "verbose" mode (-v or --verbose), one can see how the source is transformed prior to its execution.  Furthermore, in the case of easy_units, sometimes a "prefix" is "extracted" from the code, ensuring that the correct names are used.  Here's a very quick look.

python -m ideas -t easy_units -v

Ideas Console version 0.0.29. [Python version: 3.9.10]

>>> import pint
>>> un = pint.UnitRegistry()
===========Prefix============
un.
-----------------------------
>>> pressure = 5[N/m^2]
===========Transformed============
pressure = 5 * un.N/(un.m**2)
-----------------------------
>>> pressure
<Quantity(5.0, 'newton / meter ** 2')>
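To give a feel for this kind of source transformation, here is a much simplified sketch based on regular expressions. It is an assumption-laden illustration, not the actual easy_units implementation: the function name and patterns are made up, its output is grouped differently from the transformed line shown above, and it ignores cases such as "N/m*m" or "1.e-12[N]".

```python
import re

# A number (not part of a longer name) followed by units in square brackets.
UNIT_LITERAL = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)\[([^\]]+)\]")
UNIT_NAME = re.compile(r"[A-Za-z_]\w*")


def transform(source, prefix=""):
    """Rewrite `5[N/m^2]` into ordinary Python multiplication."""

    def replace(match):
        number, units = match.groups()
        units = units.replace("^", "**")  # accept ^ as the power symbol
        # Put the registry prefix (for example "un.") in front of each unit name.
        units = UNIT_NAME.sub(lambda name: prefix + name.group(0), units)
        return f"{number} * ({units})"

    return UNIT_LITERAL.sub(replace, source)


print(transform("pressure = 5[N/m^2]", prefix="un."))
```

The result is plain Python that any unit library using the standard operators could evaluate, which is the property that lets one transformation serve both pint and astropy.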

Conclusion

Prior to reading the discussion on Python-ideas, I was only vaguely aware of the existence of some units libraries available in Python, and had no idea about their potential usefulness. Many unit libraries are, in my opinion, much less user-friendly than astropy and pint. Still, I do find the requirement to add explicit multiplication symbols to be more tedious and much less readable than the alternative that I have shown.  While introducing a syntax like the one I have shown would not cause any backward incompatibilities, I doubt very much that anything like it would be added to Python, as it would likely be considered to be too specific to niche applications. I find this unfortunate ... However, I know that I can use ideas in my own projects if I ever want to use units together with a friendlier syntax.

I wrote the easy_units module in just a few hours. It is likely to contain some bugs [1], and is most definitely written as a quick hack not following best practices. If you do try it, and find some bugs, feel free to file an issue; don't bother looking at the code. ;-)

[1] Indeed, I found and fixed a couple while writing this post.

April 12, 2022 04:26 PM UTC


Real Python

Exploring Keywords in Python

Every programming language has special reserved words, or keywords, that have specific meanings and restrictions around how they should be used. Python is no different. Python keywords are the fundamental building blocks of any Python program.

In this video course, you’ll find a basic introduction to all Python keywords along with other resources that will be helpful for learning more about each keyword.

By the end of this video course, you’ll be able to:


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

April 12, 2022 02:00 PM UTC


PyCon

PyCon US 2022 Welcomes 8 Early-Stage Companies To Startup Row



It's official! The Startup Row lineup has been finalized, and we're very excited to give eight early-stage companies a unique opportunity to share what they're working on with the vibrant, diverse group of folks who come to PyCon US.

Selected through a competitive application process, Startup Row companies receive the perks of conference sponsorship without the financial burden. Many companies which previously presented on Startup Row come back to PyCon as sponsors. Ultimately, though, Startup Row is the PSF's way of giving back to technical entrepreneurs who expand and enrich the Python ecosystem. Popular tools like Pandas, Plotly, and Docker were all created by Startup Row companies. And some of the companies coming to Salt Lake City may soon become the standard-bearers of their categories.

After a challenging couple of years, it's going to be great to get back to the in-person conference experience, and 2022's Startup Row batch is perhaps the most impressive yet.

In alphabetical order, let's...

Meet The 2022 Startup Row Batch

Chainguard

The pace and scale of software supply chain attacks ramped up significantly over the past couple of years, imposing enormous time and financial costs, and in some cases jeopardizing the security of important data. "Chainguard is here to make securing the software supply chain part of the software development workflow, making the secure way to do things the easiest way to do things," said the company’s head of developer education, Lisa Tagliaferri.

Deeply rooted in the open source ecosystem, the remote-first startup aims to drive adoption of open source software like SigStore, SLSA, and Tekton to assess and validate software integrity throughout the development and implementation lifecycle. Founded and led by core maintainers of some of these projects, the company is finding early commercial traction with large enterprises in regulated industries such as financial services and healthcare while continuing to promote and contribute to the open source software ecosystem. The company is investor-backed, with institutional capital coming from Amplify Partners and an array of individual angel investors.

Devron

Not so long ago, federated machine learning was the purview of academic AI researchers and software skunkworks in big tech companies. A technique which facilitates ML model training and tuning across distributed data sources, federated machine learning is a privacy-preserving methodology capable of delivering accurate, explainable ML models without the need for large centralized datasets, computing resources, and cumbersome pre-processing to maintain compliance with privacy regulations and internal best practices.

Founded in 2020 by CEO Kartik Chopra, a former undercover CIA technical intelligence officer, Devron aims to deliver the benefits of federated machine learning techniques to enterprises and government agencies that seek to harness data that may be scattered throughout various parts of their organizations. Prolific investment firm Tiger Global led Devron's Series A round in February 2022, which saw participation from prior investors FinTech Collective, Afore Capital, and Essence Venture Capital.

OpenBlender

When it comes to ML model performance, context can sometimes make all the difference. Most production ML models are trained on a dataset that's unique to the task at hand, but enriching that proprietary data with contextual external data can often result in more robust, reliable ML models. But as anyone who works with data—from first-timers to PhDs—can tell you, connecting external data to internal data is rarely as simple as importing a couple of CSVs and typing out a pd.merge() function.

OpenBlender is a San Diego, CA-based startup building a toolkit that enables data science teams to enrich their internal datasets with publicly available geospatial and time series data from any source without the hassle of building data extraction, processing, reconciliation, and integration workflows just to access a couple of features. Named a 2021 "Cool Vendor" in data for AI and machine learning by Gartner, OpenBlender indexes and updates an ever-expanding library of external datasets which can be pulled, updated, and blended by time or location via Pandas and R dataframes.

OtterTune

One of the more tedious aspects of managing complex software systems is configuration–the process of flipping virtual switches and turning virtual knobs to first get a system up and running, and then to maximize performance and reliability of that system. A key element of most software architectures, having a properly configured database is important, but manual tuning can be costly and time-consuming.

OtterTune, a startup spun out of the Carnegie Mellon Database Group and led by CMU Professor Andy Pavlo, builds upon academic research into a machine learning-powered framework for automatically configuring database management systems (DBMSs) for each type of workload they're tasked with. The scope and complexity of modern systems is beyond the comprehension of any individual person, the company says, so machine learning techniques are helpful in discovering and setting the hundreds of virtual knobs where they need to be to achieve optimal system performance. OtterTune currently supports cloud-based MySQL and PostgreSQL databases provided by Amazon Aurora and Amazon RDS, as well as certain on-premise database implementations. As an academic project, OtterTune was partially funded by the National Science Foundation; as a startup, the company raised seed funding led by Accel.

Ploomber

Facilitating rapid exploration, manipulation, and processing of datasets, notebook-style environments have changed the way data scientists do their work. However, the ease of pasting a block of data-crunching code from one notebook to another can result in some serious maintenance headaches for those tasked with re-plumbing a collection of notebooks. Creating and maintaining one data processing pipeline that can be used across multiple notebooks and run in the cloud can streamline the workflows of data science teams.

Ploomber is building open source cloud infrastructure to help data scientists bring best practices from the software engineering world into their workflows. Ploomber helps them build and deploy repeatable, maintainable data processing pipelines that can be shared across local notebooks and execute in a cloud environment. The New York City-based company participated in the most recent Y Combinator accelerator batch. Its main GitHub repository has over 2,300 stars and Ploomber's open source package is currently downloaded over 15,000 times per month.

Rasgo

The performance of a machine learning model is largely dependent on the quality of the data it was trained on, and the process of sourcing raw data and transforming it into something that's usable for ML is both time-consuming and tiresome.

Rasgo is a New York-based company which offers a suite of data prep power tools, both through a free Python package and a low-code web interface. The company also recently unveiled RasgoQL, an open source Python package that enables data teams to write Python locally while executing SQL on a cloud-hosted data warehouse. Aimed at automating aspects of the data transformation and feature engineering, the company's goal is to help data science and machine learning teams accelerate time to value: from raw data to robust features for training ML models. The Rasgo team is emphatic in its support for the open source community and at the time of writing, its RasgoQL package has been downloaded over 8,000 times in less than 30 days since release. The company has raised over $25 million in venture capital, with backing from Unusual Ventures and Insight Partners, among others.

Slim.AI

Containers changed the way a lot of software is built and used, especially in cloud environments. Boston-area startup Slim.AI aims to make the process of building and deploying cloud-native software just a little more seamless. Slim.AI users can discover and gain visibility into software containers, all while improving efficiency by pruning unused parts of those containers.

The company is led by Kyle Quest (the creator of DockerSlim, an open source package with over 13,100 stars on GitHub) and John Amaral, who previously led product for Cisco Cloud Security. Offering more than the container minification features found in the open source DockerSlim package, Slim.AI is building a more holistic SaaS solution for facilitating container-driven development workflows. Founded in January 2021, Slim.AI is backed by investors including Decibel, Insight Partners, and boldstart Ventures, among others.

Union.ai

Orchestrating machine learning workflows is a challenge for large data science and AI teams operating on high-scale projects. Flyte started at Lyft as an internal effort to build a machine learning orchestration platform. The intention of Flyte is to integrate and automate diverse workflows across software-, data-, and ML engineering disciplines through one platform. Flyte has been used at Lyft, Spotify, and Freenome, among others, and was incubated by the Linux Foundation after going open source in 2020. Union.ai is positioned as an infrastructure partner, helping enterprise customers access the benefits of Flyte.

How You Can Support Startup Row Companies

There are many ways to support Startup Row companies, both at PyCon and afterward:
  • Visit Startup Row! As the name implies, it's literally a row of booths for startups at PyCon US. You can't miss it!
  • Join a Startup Row company! Many companies on Startup Row during the main conference will also be present at the PyCon Jobs Fair on Sunday, May 1st. During the main conference, don't be bashful about going up to their booths and asking about open opportunities.
  • Contribute to their open source projects! Many of the companies on Startup Row this year have strong roots in open source, and new contributors are always welcome.
  • Offer constructive feedback and help! Startups thrive on constructive feedback. Share your perspectives and offer to help out if you think it can benefit the team.
Most importantly: Be understanding. Starting something is really hard, and running an early stage startup is often like building an airplane while already airborne. It's okay if someone doesn't have all the answers yet. Learn about what they're working on, ask good questions, and let them know how you can be helpful.

Acknowledgements

Breaking the fourth wall for the conclusion here, I have so many thank-you's to deliver.

First of all, thanks so much to PyCon organizers for the incredible work you all do to make PyCon US a reality, whether it's in person or virtually. The work everyone does to deliver an engaging, welcoming, and safe experience for sponsors and attendees does not go unnoticed. (And thanks for the opportunity to help out with a small but very fun part of the conference!)

Thanks also to all the entrepreneurs who applied with their companies. The application may have taken only 20 or 30 minutes to complete, but that's a half hour that could've otherwise been spent on hiring, or talking to users, or shipping code. (Or, you know, doing something non-work related.) Thanks for taking the time and attention.

This really was one of the strongest applicant pools yet, so a penultimate thanks goes out to the selection committee that helps process and stack-rank applications. Thanks to the Startup Row alumni and other community members who were able to help out.

Finally and most importantly, thanks to the 8 companies coming to Salt Lake City at the end of the month. It's been great getting to know you over a series of emails and video chats, and it'll be even better to see you in person. Congrats again for earning a spot on Startup Row at PyCon US 2022!

April 12, 2022 01:14 PM UTC


Python for Beginners

Tuple String to Tuple in Python

Converting data from one form to another is a tedious task. In this article, we will discuss two ways to convert a tuple string to a tuple in Python.

How to Convert a Tuple String to a Tuple in Python

Suppose that we are given a tuple in the form of a string as follows.

myStr = "(1,2,3,4,5)"

Now, we have to create the tuple (1,2,3,4,5) from the given string. To do so, we will first remove the parentheses and the comma “,” characters: we will replace the parentheses with an empty string and the commas with spaces using the replace() method. The replace() method, when invoked on a string, takes the substring to be replaced as the first input argument and the replacement as the second input argument. We will replace the “(”, “)”, and “,” characters using the replace() method one by one as follows.

myStr = "(1,2,3,4,5)"
print("The tuple string is:", myStr)
myStr = myStr.replace("(", "")
myStr = myStr.replace(")", "")
myStr = myStr.replace(",", " ")
print("The output string is:", myStr)

Output:

The tuple string is: (1,2,3,4,5)
The output string is: 1 2 3 4 5

Now, we have obtained a string that contains numbers separated by spaces. To obtain the numbers from the string, we will now split the string using the split() method. The split() method, when invoked on a string, takes a separator as an optional input argument and returns a list containing the elements after splitting the string at the specified separator. If we do not give any separator as an input argument, it splits the string at whitespace.

We will split the string using the split() method as follows.

myStr = "(1,2,3,4,5)"
print("The tuple string is:", myStr)
myStr = myStr.replace("(", "")
myStr = myStr.replace(")", "")
myStr = myStr.replace(",", " ")
myList = myStr.split()
print("The output list is:", myList)

Output:

The tuple string is: (1,2,3,4,5)
The output list is: ['1', '2', '3', '4', '5']

Now, we have obtained a list where all the numbers are present. But, they are present as strings. To obtain a list of integers, we will use the map() function and the int() function as follows.

myStr = "(1,2,3,4,5)"
print("The tuple string is:", myStr)
myStr = myStr.replace("(", "")
myStr = myStr.replace(")", "")
myStr = myStr.replace(",", " ")
myList = myStr.split()
myList = list(map(int, myList))
print("The output list is:", myList)

Output:

The tuple string is: (1,2,3,4,5)
The output list is: [1, 2, 3, 4, 5]

As we have obtained the list of integers, we will create a tuple from the list as shown below.

myStr = "(1,2,3,4,5)"
print("The tuple string is:", myStr)
myStr = myStr.replace("(", "")
myStr = myStr.replace(")", "")
myStr = myStr.replace(",", " ")
myList = myStr.split()
myList = list(map(int, myList))
myTuple = tuple(myList)
print("The output tuple is:", myTuple)

Output:

The tuple string is: (1,2,3,4,5)
The output tuple is: (1, 2, 3, 4, 5)

You can observe that we have converted the tuple string to a tuple using the replace() method, the split() method, and the map() and int() functions in Python.
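
The same steps can be condensed into a small helper. This is only a sketch: the name tuple_from_string is hypothetical, and it assumes the string holds a flat tuple of integers with no nesting.

```python
def tuple_from_string(text):
    """Convert a string such as "(1,2,3,4,5)" to a tuple of integers."""
    # Drop surrounding whitespace, then the enclosing parentheses
    inner = text.strip().strip("()")
    # Split at commas, skip empty pieces (e.g. from "()" or a trailing comma)
    return tuple(int(part) for part in inner.split(",") if part.strip())


myTuple = tuple_from_string("(1,2,3,4,5)")
print("The output tuple is:", myTuple)
```

Splitting directly at the commas avoids the intermediate step of replacing them with spaces.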

Tuple String to a Tuple Using The eval() Function in Python

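The eval() function evaluates a string as a Python expression and returns the result. Because a tuple string such as "(1,2,3,4,5)" is a valid Python expression, we can convert it directly to a tuple using the eval() function as shown below. Note that eval() executes whatever expression the string contains, so it should only be used on input you trust.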

myStr = "(1,2,3,4,5)"
print("The tuple string is:", myStr)
myTuple = eval(myStr)
print("The output tuple is:", myTuple)

Output:

The tuple string is: (1,2,3,4,5)
The output tuple is: (1, 2, 3, 4, 5)
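
If the string may come from an untrusted source, one possible alternative (not covered in the original article) is ast.literal_eval() from the standard library, which only accepts Python literals and refuses function calls. The helper name parse_tuple_string below is hypothetical.

```python
import ast


def parse_tuple_string(text):
    """Parse a tuple literal such as "(1,2,3,4,5)" without executing code."""
    value = ast.literal_eval(text)
    # literal_eval also accepts lists, numbers, strings, etc., so check the type
    if not isinstance(value, tuple):
        raise ValueError(f"expected a tuple literal, got {type(value).__name__}")
    return value


myTuple = parse_tuple_string("(1,2,3,4,5)")
print("The output tuple is:", myTuple)
```

Unlike the replace() and split() approach, this also handles nested tuples and non-integer elements, as long as they are literals.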

Conclusion

In this article, we have discussed two ways to convert a tuple string to a tuple in Python. To learn more about strings, you can read this article on string formatting in Python. You might also like this article on list comprehension in Python.

The post Tuple String to Tuple in Python appeared first on PythonForBeginners.com.

April 12, 2022 11:35 AM UTC


Stack Abuse

Guide to Python's append() Function

Introduction

Basic data types in Python allow us to store a single value under a symbolic name. This roughly follows the mathematical notation of variables. In a way, a name is assigned to a value, so you don't need to remember the actual value, nor its address in computer memory, just a simple, illustrative name.

But if you need to store a collection of values under one variable name, basic data types won't do the job. You'll need to use more complex data structures. Python has four built-in data types for storing a collection of values under the same name - Tuple, Set, Dictionary, and List. We'll focus on the last of these in this article.

A List is a Python data type similar to an array in many other programming languages. It stores an ordered collection of values under the same name. It also allows duplicate values, as well as changing the values of stored elements. The main difference between a List and a typical array is that the elements of a List don't all need to have the same data type (it's heterogeneous). For example, one List may contain integers, floating-point numbers, strings, other Lists, and elements of any other data type:

example_list = [1, 3.14, 'abcd', [4, 3, 2, 1]]

Note: You create a Python List by listing its elements between two square brackets - [...]. Each element is separated by a comma - ,. Python also has a separate array type (in the array module), which shouldn't be confused with Lists.

In this guide, we'll take a look at how to add elements to the end of a List in Python, how to merge lists, etc. using the append() method, and compare it to other methods used to add elements to a List - extend() and insert().

How To Append Elements to a Python List Using append()

Appending elements to a List means adding those elements to the end of an existing List. Python provides several ways to achieve that, but the method tailored specifically for that task is append(). It has a pretty straightforward syntax:

example_list.append(element)

This code snippet will add the element to the end of the example_list (which is of list type). As we've stated before, a list can contain elements of different data types. Therefore, element can be of any possible data type - int, float, str, list, tuple, and so on.

In the following sections, we'll go over some practical examples illustrating how to append an individual element to a list, as well as how to append one list to another.

Note: In the following examples, we use a List containing elements of different types.

How To Add a Single Element to the End of a Python List

Adding a single element illustrates the main purpose of the append() method in Python. Let's assume you have an example list:

example_list = [1, 3.14, 'abcd']

You would add 5 to the end of the example_list in the following way:

example_list.append(5)

Now, the example_list will have 5 added to its end:

[1, 3.14, 'abcd', 5]
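
One detail worth illustrating is that append() modifies the list in place and returns None. The short sketch below uses a hypothetical variable named result purely to show this.

```python
example_list = [1, 3.14, 'abcd']

# append() changes the list in place and returns None,
# so its result should not be assigned back to the list variable.
result = example_list.append(5)

print(example_list)  # [1, 3.14, 'abcd', 5]
print(result)        # None
```

Writing example_list = example_list.append(5) is therefore a common mistake: it would replace the list with None.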

How To Append One List to Another in Python

Assume you have two lists and want to append one to another:

example_list = [1, 3.14, 'abcd']
secondary_list = [4, 3, 2, 1]

The append() method doesn't provide a way to append two lists together in one method call. If you try to append those two lists using append(), the whole secondary_list will be added as a single element of the example_list, creating a nested list:

example_list.append(secondary_list)
print(example_list)

Now, the example_list contains the following elements, which are probably not what you wanted in the first place:

[1, 3.14, 'abcd', [4, 3, 2, 1]]

Appending one list to another using append() requires iterating over all the elements of the list we want to append and appending each of them to the original List. Starting again from the original example_list:

for element in secondary_list:
    example_list.append(element)

print(example_list)

That way, we've appended the secondary_list to the end of the example_list:

[1, 3.14, 'abcd', 4, 3, 2, 1]

Alternatives to append() in Python

Python Lists have a couple more methods for adding elements besides append(), most notably extend() and insert(). In the following subsections, we'll get into the differences between them and the append() method.

append() vs extend()

As we've seen in previous sections, append() is intended to add one element to the end of a List. On the other hand, extend() is used to add multiple elements to the end of a List - effectively, it appends one list to another. Let's see how extend() works:

example_list = [1, 3.14, 'abcd']
secondary_list = [4, 3, 2, 1]

example_list.extend(secondary_list)
print(example_list)

Output:

[1, 3.14, 'abcd', 4, 3, 2, 1]

Note how extend() appends one list to another in a single call, whereas append() needs to be called once for each element of the List you want to append! It's a handy method to remember as an alternative.
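
As a further sketch of extend() (the values here are made up for illustration), the method accepts any iterable, not only lists, and the += operator on a list behaves like extend():

```python
example_list = [1, 3.14, 'abcd']

# extend() accepts any iterable, not only lists
example_list.extend((4, 3))   # a tuple
example_list.extend('xy')     # a string adds one element per character

# The += operator on a list extends it in place, like extend()
example_list += [2, 1]

print(example_list)  # [1, 3.14, 'abcd', 4, 3, 'x', 'y', 2, 1]
```

The string case is worth remembering: extend('xy') adds 'x' and 'y' separately, while append('xy') would add the single element 'xy'.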

append() vs insert()

There is no way to insert an element at a specific place in a List using append(); it always adds the element to the end of the List. That's where insert() comes into view!

Unlike append() and extend(), it accepts two arguments: the first is the index at which to insert, and the second is the element you want to insert.

For example, if you want to add 'asdf' to the end of the example_list you would use example_list.append('asdf'), as we've seen in previous sections. But if you want to add it to a specific place, say, between 3.14 and 'abcd', you must use insert():

example_list = [1, 3.14, 'abcd']
# Insert the element 'asdf' at index 2
example_list.insert(2, 'asdf')

print(example_list)

This results in:

[1, 3.14, 'asdf', 'abcd']

Note the difference in indexing between the original and resulting lists. In the original example_list, the element at index 2 is 'abcd'. After the insertion, 'asdf' is at index 2, and 'abcd' has shifted to index 3.
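
A few edge cases of insert() can be sketched as follows; the inserted values ('end', 'start', 'last') are made up for illustration:

```python
example_list = [1, 3.14, 'abcd']

# Inserting at index len(example_list) is equivalent to append()
example_list.insert(len(example_list), 'end')

# Inserting at index 0 puts the element at the front
example_list.insert(0, 'start')

# An index larger than the length does not raise an error;
# the element is simply placed at the end
example_list.insert(100, 'last')

print(example_list)  # ['start', 1, 3.14, 'abcd', 'end', 'last']
```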

Conclusion

After reading this guide, you should have a better understanding of how to use the append() method on Python Lists and how it compares to other Python methods for adding elements to a List.

For a more in-depth comparison of those methods, you should definitely take a look at the following guide - append() vs extend() vs insert() in Python Lists.

April 12, 2022 10:30 AM UTC

April 11, 2022


TestDriven.io

Permissions in Django

This article looks at how to leverage Django's default permission system to assign permissions to users and groups.

April 11, 2022 10:28 PM UTC


Real Python

Python News: What's New From March 2022?

In March 2022, the Python 3.11.0a6 pre-release version became available for you to test, so you can stay on top of Python’s latest features. This release is the sixth of seven planned alpha releases before Python enters the beta phase, which is scheduled for May 5, 2022.

PEPs now have a new home with a sleek, modern theme. Also, PEP 594, which deals with removing dead batteries from the Python standard library, has been accepted. Regarding Python events, EuroPython 2022 held its call for proposals (CFP) and is currently selling tickets for the conference.

Let’s dive into the most exciting Python news from the past month!

Python Released Several New Versions

Almost every month, several Python versions are released. They typically add new features, fix bugs, correct security issues, and more. March 2022 was no exception. You now have several new releases to test, use, and enjoy. Read on to learn more!

Python 3.11.0a6 Became Available

The sixth alpha release of Python became available on March 7. After a week’s delay due to some internal problems, Python 3.11.0a6 is here for you to take for a test drive. Python 3.11 boasts several new features and changes:

  • PEP 657 – Include Fine-Grained Error Locations in Tracebacks
  • PEP 654 – Exception Groups and except*
  • PEP 673 – Self Type
  • PEP 646 – Variadic Generics

To learn more about the basics of these features, check out Python News: What’s New From February 2022?. Additionally, if you’d like an early dive into how fine-grained error locations can improve your coding and debugging experience, check out Python 3.11 Preview: Even Better Error Messages.

To try out the most exciting features that will come with Python 3.11 and to stay up to date with the language’s evolution, go ahead and install the new interpreter. Feel free to select your favorite installation procedure below:

$ docker pull python:3.11.0a6-slim
$ docker run -it --rm python:3.11.0a6-slim

$ pyenv update
$ pyenv install 3.11.0a6
$ pyenv local 3.11.0a6
$ python

$ git clone git@github.com:python/cpython.git
$ cd cpython/
$ git checkout v3.11.0a6
$ ./configure
$ make
$ ./python

Give it a try! Go your own way and explore the cool new features of Python 3.11 with your own hands!

Other Python Releases

Read the full article at https://realpython.com/python-news-march-2022/ »


April 11, 2022 02:00 PM UTC


Mike Driscoll

PyDev of the Week: Peter Baumgartner

This week we welcome Peter Baumgartner (@pmbaumgartner) as our PyDev of the Week! Peter is a fellow Python blogger who writes about Python and data science. Peter also has a collection of interesting Jupyter Notebooks that you can use to learn from. You can see what projects Peter is working on over on GitHub.

Let's take a few moments to get to know Peter better!

Can you tell us a little about yourself (hobbies, education, etc):

I'm currently a Machine Learning Engineer at Explosion. Prior to this, I worked at a non-profit research institute called RTI International, and when I started in data science using python I worked for Deloitte. I earned my Masters in Analytics at the Institute for Advanced Analytics at NC State University. Prior to that I was a high school math teacher.

My primary hobbies are running and creating art with a pen plotter. Every weekend I try to run or volunteer at a local timed 5k put on by Parkrun, which is a really cool organization I encourage runners of any ability to check out. A pen plotter is basically a robot you program to draw - I picked one up last year and post most of my work to twitter.

These days I don't have too much time for hobbies as most of my free time is spent helping raise my now 1-year-old son, Clark. He's a really fun, adventurous kid who is enjoying exploring the world now that he's mobile.

Why did you start using Python?

After I finished my Master's degree, I was doing some contract work for a local marketing firm. My Master's program taught us everything in the SAS programming language, but SAS is expensive and in my opinion a painful language to program in, so for this contract work we had decided to use python. It was a real trial-by-fire as I had to learn python and provide useful analysis to the client. In the end, it worked out because it made me realize programming could actually be fun and not always a struggle. Ever since then I've been primarily a python user.

What other programming languages do you know and which is your favorite?

The first language I learned was Visual Basic—I took a computer science class in high school that really opened my eyes to cool things you could do in programming. In college I also took a Computer Science course that used C++, but I have absolutely no recollection of that knowledge. I also learned a bit of SAS and R during my Master's program.

Since I've been doing programming professionally, the language I've learned the most is Julia, which is my favorite non-python language. What I like about it is that it was easy to learn coming from python, which I think is very important. I previously attempted to learn Rust, but the syntax and concepts were different enough that it was too difficult for me at the time. With Julia, it exposed me to the fact that there are different ways to think about and solve problems, and I could conceptually take what I had learned and apply it to how I developed python programs. It also forced me to increase my knowledge in some computer science fundamentals that I had never learned.

In general, I'd encourage everyone to learn a second programming language, but probably one syntactically close to their primary language. It was really helpful for me to learn another language after about 5 years of programming in python just to be exposed to alternative ways of how a programming language could work.

What projects are you working on now?

At Explosion, we just launched our consulting offering called spaCy Tailored Pipelines, which is leading to some very interesting applied natural language processing projects. In addition to that, I spend a lot of time reviewing how people are using our products and improving them by updating documentation, adding examples, or creating new open-source libraries to complement our tools. For example, a recent component I've developed simply counts the tokens that people see when they pass text through a spaCy pipeline. I started working on that because I noticed a lot of people were requesting this feature and they were confused about how this should work within spaCy. Another example would be a component that parses text from HTML. Often times people will have data from scraped web pages and want to do natural language processing on it, but if they just take the raw text from the HTML they ignore the structure of the document which might have some negative downstream impacts.

Which Python libraries are your favorite (core or 3rd party)?

I think this is the hardest question to answer because there are so many good ones.

Core:

3rd party: There are ones that I use almost every day: pandas, spaCy, umap-learn, altair, tqdm, pytest, black, numpy that are all amazing.

Then there are some libraries that I love and use in specific circumstances, like typer, rich, questionary for CLI tools. poetry for packaging. streamlit for making simple apps. numba for faster array operations. sentence-transformers for NLP when sentences are involved. loguru for logging. shapely and vsketch for anything with my plotter.

 

How did you decide to write a Python blog?

I've had a blog for a long time but until recently I was publishing on it less than I was happy with. Recently I've been trying to reframe my writing process by recognizing that blog posts don't have to be perfect. I read a lot about good writing, and read a lot of good technical writing, and often times that puts so many constraints in my head when writing something that I never end up finishing anything. The practices for good writing would be useful if I was writing something more formal, like a book, but for my personal blog I give myself permission to not think about that stuff too much.

Where do you get your ideas from when it comes to writing articles?

Almost all of my articles are documenting things that I've recently learned. I try to think of writing a blog post as steps in the Feynman technique of learning. I used to be a teacher, so I also try and be cognizant of the Curse of Knowledge and write things down as I'm learning them, rather than after I'm done learning, then reorganize those original thoughts in the way that I think about something after I've learned it.

Is there anything else you’d like to say?

Support the developers and organizations that help make the python ecosystem great!


Thanks for doing the interview, Peter!

The post PyDev of the Week: Peter Baumgartner appeared first on Mouse Vs Python.

April 11, 2022 12:32 PM UTC


Django Weblog

Django security releases issued: 4.0.4, 3.2.13, and 2.2.28

In accordance with our security release policy, the Django team is issuing Django 4.0.4, Django 3.2.13, and Django 2.2.28. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.

Django 2.2 has reached the end of extended support. The final security release (2.2.28) was issued today. All Django 2.2 users are encouraged to upgrade to Django 3.2 or later.

CVE-2022-28346: Potential SQL injection in QuerySet.annotate(), aggregate(), and extra()

QuerySet.annotate(), aggregate(), and extra() methods were subject to SQL injection in column aliases, using a suitably crafted dictionary, with dictionary expansion, as the **kwargs passed to these methods.
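
Upgrading is the fix for this issue. For applications that build such keyword arguments from user-controlled input, an additional precaution might look like the sketch below. It is plain Python and deliberately does not call Django; validate_aliases and the sample key are hypothetical and not part of Django's API.

```python
import re

# Hypothetical helper, not part of Django: accept only plain identifiers
# as alias names before expanding a dictionary into keyword arguments.
_SAFE_ALIAS = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_aliases(mapping):
    """Return the mapping unchanged if every key is a plain identifier."""
    for key in mapping:
        if not isinstance(key, str) or not _SAFE_ALIAS.fullmatch(key):
            raise ValueError(f"unsafe alias name: {key!r}")
    return mapping


# A crafted key like this one is rejected before it could reach a query
untrusted = {'total" FROM auth_user; --': 1}
try:
    validate_aliases(untrusted)
except ValueError as exc:
    print(exc)
```

The validated dictionary could then be expanded with ** when calling a method such as annotate().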

Thanks Splunk team: Preston Elder, Jacob Davis, Jacob Moore, Matt Hanson, David Briggs, and a security researcher: Danylo Dmytriiev (DDV_UA) for the report.

This issue has severity "high" according to the Django security policy.

CVE-2022-28347: Potential SQL injection via QuerySet.explain(**options) on PostgreSQL

QuerySet.explain() method was subject to SQL injection in option names, using a suitably crafted dictionary, with dictionary expansion, as the **options argument.

This issue has severity "high" according to the Django security policy.

Affected supported versions

  • Django main branch
  • Django 4.0
  • Django 3.2
  • Django 2.2

Resolution

Patches to resolve the issue have been applied to Django's main branch and to the 4.0, 3.2, and 2.2 release branches. The patches may be obtained from the following changesets.

CVE-2022-28346:

CVE-2022-28347:

The following releases have been issued:

The PGP key ID used for this release is Mariusz Felisiak: 2EF56372BA48CD1B.

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.

April 11, 2022 07:55 AM UTC


William Minchin

AutoLoader Plugin 1.1.0 for Pelican Released

AutoLoader is a plugin for Pelican, a static site generator written in Python.

AutoLoader is a “meta plugin” in that it doesn’t directly affect your Pelican site, but rather works to make your other plugins better. By way of background, Pelican 4.5 added the ability to autoload plugins that exist in the pelican.plugins namespace. This plugin allows you to extend this autoload ability to any arbitrary namespace. In particular, it defaults to extending this ability to my minchin.pelican.plugins namespace, and thus will autoload my other plugins, if installed. It can also be used to add plugin autoloading to earlier versions of Pelican.

This Release

This release adds the ability to disable auto-loading of specific plugins. In particular, it defaults to no longer trying to load pelican.plugins.signals and pelican.plugins._utils which are modules within the pelican.plugins namespace, but are not actually plugins.

Upgrading

The simplest way to upgrade (or install) AutoLoader is to use pip:

pip install minchin.pelican.plugins.autoloader --upgrade

No configuration changes are needed.

If you want to use these new features, define AUTOLOADER_PLUGIN_BLACKLIST in your pelicanconf.py:

# pelicanconf.py

from minchin.pelican.plugins import autoloader

AUTOLOADER_PLUGIN_BLACKLIST = autoloader.DEFAULT_PLUGIN_BLACKLIST + [
    "pelican.plugins.misbehaving_plugin",
    # other plugins
]

Known Issues

April 11, 2022 01:56 AM UTC