Planet Python
Last update: January 31, 2021 07:46 AM UTC
January 31, 2021
The Three of Wands
On structured and unstructured data, or the case for cattrs
If you've ever gone through the Mypy docs, you might have seen the section on TypedDict. The section goes on to introduce the feature by stating:
Python programs often use dictionaries with string keys to represent objects. [...] you can use a TypedDict to give a precise type for objects like movie, where the type of each dictionary value depends on the key:
from typing_extensions import TypedDict
Movie = TypedDict('Movie', {'name': str, 'year': int})
movie = {'name': 'Blade Runner', 'year': 1982} # type: Movie
In other words, TypedDict exists to make dictionaries a little more like classes (in the eyes of Mypy, in this particular case), and is only one example of a growing menagerie of similar efforts to make dictionaries classes.
In this post, I maintain that in modern Python classes already exist, are fit-for-purpose and dictionaries should just be left to be dictionaries.
Value Objects
Pretty much every application and every API has a notion of data models on some level. These are prime examples of structured data - pieces of information with a defined shape (usually the names and types of subfields). The TypedDict example from the introduction defines a data model with two fields. Let's call these pieces of data value objects. Value objects come in a million flavors on many different abstraction layers; they can range from a Django model to a class you define in a one-liner to be able to return multiple values from a function, to just a dictionary. Value objects usually don't have a lot of business logic attached to them so it might be a stretch calling some of these value objects, but let's roll with it here.
In Python, the most natural way of modeling value objects is a class; since an instance of a class is just that - a piece of structured data.
When the TypedDict docs claim that Python programs often use dictionaries to model value objects, they aren't incorrect. The reason, however, is that historically Python has not had good tools for using classes as value objects, not that dictionaries are actually good or desirable for this purpose. Let's look at why this is the case.
JSON Value Objects
One of the biggest reasons, I believe, is JSON, probably the most popular serialization format of our time. Python has great tools for converting a piece of JSON into unstructured data (Python primitives, lists and dictionaries) - there's a JSON library included in Python's standard library, and very robust, well-known and performant third-party JSON libraries. Pretty much all Python HTTP libraries (client and server) have special cases for easy handling of JSON payloads.
Now, take into account that the most straightforward way to model a value object in JSON is simply using a JSON object with fields corresponding to the value object fields. So, parsing the JSON payload {"name": "Blade Runner", "year": 1982} into a dictionary is extremely easy, and converting this into a proper Python value object much less so.
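To make that concrete, here's a minimal sketch (the payload and the hand-written class are hypothetical): parsing the JSON is one line, while building a value object from it takes bespoke, per-class code:

```python
import json

payload = '{"name": "Blade Runner", "year": 1982}'

# Getting unstructured data out of JSON is one line:
movie_dict = json.loads(payload)

# Turning it into a proper value object takes bespoke, per-class code:
class Movie:
    def __init__(self, name, year):
        self.name = name
        self.year = year

movie = Movie(name=movie_dict["name"], year=movie_dict["year"])
print(movie.name, movie.year)  # Blade Runner 1982
```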
Modern Python Value Objects
Historically, creating Python value object classes and populating them with data from somewhere (like a JSON payload) has been very cumbersome. There have been three recent developments in the broader Python ecosystem that make this much better.
attrs
We now have attrs. attrs is a Python library for declaratively defining Python classes, and is particularly amazing for modeling value objects. attrs itself has excellent docs and makes a great case against manually writing classes (which it whimsically calls artisanal classes) here. The example nicely illustrates the amount of code needed for a well-behaved value object. No wonder the Python ecosystem used to default to dictionaries.
A small note on dataclasses: the dataclasses module is basically a subset clone of attrs present in the Python standard library. In my opinion, the only use of dataclasses is if you don't have access to third-party libraries (i.e. attrs), for example if you're creating simple scripts that don't require a virtual environment or are writing code for the standard library. If you can use pip you should be using attrs instead, since it's just better.
Field-level type annotations
We now (since Python 3.6) have field-level type annotations in classes (aka PEP 526).
This makes it possible to define a value object thusly:
import attr

@attr.define
class Movie:
    name: str
    year: int
The most important part of this PEP is that the type information for the value object fields is available at runtime. (Classes like this were possible before this PEP using type comments, but those aren't usable at runtime.)
The field type information is necessary for handling structured data; especially any kind of nested structured data.
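To see what "available at runtime" means, a minimal stdlib-only sketch:

```python
import typing

class Movie:
    name: str
    year: int

# PEP 526 field annotations can be inspected at runtime,
# which is exactly what structuring libraries rely on:
hints = typing.get_type_hints(Movie)
print(hints)  # {'name': <class 'str'>, 'year': <class 'int'>}
```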
cattrs
We now have cattrs. cattrs is my library for efficiently converting between unstructured and structured Python data. To simplify, cattrs ingests dictionaries and spits out classes, and ingests classes and spits out dictionaries. attrs classes are supported out of the box, but anything can be structured and unstructured. For example, the usage docs show how to convert Pendulum DateTime instances to strings, which can then be embedded in JSON.
cattrs uses converters to perform the actual transformations, so the un/structuring logic is not on the value objects themselves. This keeps the value objects leaner and allows you to use different rules for the same value object, depending on the context.
So cattrs is the missing layer between our existing unstructured infrastructure (our JSON/msgpack/bson/whatever libraries) and the rich attrs ecosystem, and the Python type system in general. (cattrs goes to efforts to support higher-level Python type concepts, like enumerations and unions.)
I believe this functionality is sufficiently complex for it to have a layer of its own and that it doesn't really make sense for lower-level infrastructure (like JSON libraries) to implement it itself, since the conversion rules between higher-level components (like Pendulum DateTimes) and their serialized representations need to be very customizable. (In other words, there's a million ways of dumping DateTimes to JSON.)
Also, if the unstructured layer only concerns itself with creating unstructured data, the structuring logic can be in one place. In other words, if you use ujson + cattrs, you can easily switch to msgpack + cattrs later (or at the same time).
Putting it all to use
Let's try putting this to use. Let's say we want to load a movie from a JSON HTTP endpoint.
First, define our value object in code. This serves as documentation, runtime information for cattrs, and type information for Mypy.
import attr

@attr.frozen
class Movie:
    name: str
    year: int
Second, grab the unstructured JSON payload.
>>> payload = httpx.get('http://my-movie-url.com/movie').json()
Third, structure the data into our value object (this will throw exceptions if the data is not the shape we expect). If our data is not exotic and doesn't require manual customization, we can just import structure from cattr and use that.
>>> movie = cattr.structure(payload, Movie)
Done!
Addendum: What should dictionaries actually be used for?
The attrs docs already have a great section on what dictionaries should be used for, so I'll be brief in adding my two cents.
If the value type of your dictionary is any sort of union, it's not really a dictionary but a value object in disguise. For the movie example, the type of the dictionary would be dict[str, Union[str, int]], and that's a tell-tale sign something's off (and the raison d'etre for TypedDict). A true dictionary would, for example, be a mapping of IDs to Movies (if movies had IDs), the type of which would be dict[int, Movie]. There's no way to turn this kind of data into a class.
January 30, 2021
John Cook
Python triple quote strings and regular expressions
There are several ways to quote strings in Python. Triple quotes let strings span multiple lines; line breaks in your source file become line break characters in your string. A triple-quoted string in Python acts something like a “here doc” in other languages.
However, Python’s indentation rules complicate matters because the indentation becomes part of the quoted string. For example, suppose you have the following code outside of a function.
x = """\ abc def ghi """
Then you move this into a function foo and change its name to y.
def foo():
    y = """\
    abc
    def
    ghi
    """
Now x and y are different strings! The former begins with a and the latter begins with four spaces. (The backslash after the opening triple quote prevents the following newline from being part of the quoted string. Otherwise x and y would begin with a newline.) The string y also has four spaces in front of def and four spaces in front of ghi. You can’t push the string contents to the left margin because that would violate Python’s formatting rules.
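The difference is easy to verify directly:

```python
x = """\
abc
def
ghi
"""

def foo():
    y = """\
    abc
    def
    ghi
    """
    return y

# The indentation inside foo() is part of the string:
print(repr(x))      # 'abc\ndef\nghi\n'
print(repr(foo()))  # '    abc\n    def\n    ghi\n    '
```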
We now give three solutions to this problem.
Solution 1: textwrap.dedent
There is a function in the Python standard library that will strip the unwanted space out of the string y.
import textwrap

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = textwrap.dedent(y)
This works, but in my opinion a better approach is to use regular expressions [1].
Solution 2: Regular expression with a flag
We want to remove white space, and the regular expression for a white space character is \s. We want to remove one or more white spaces so we add a + on the end. But in general we don’t want to remove all white space, just white space at the beginning of a line, so we stick ^ on the front to say we want to match white space at the beginning of a line.
import re

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = re.sub(r"^\s+", "", y)
Unfortunately this doesn’t work. By default ^ only matches the beginning of a string, not the beginning of a line. So it will only remove the white space in front of the first line; there will still be white space in front of the following lines.
One solution is to add the flag re.MULTILINE to the substitution function. This will signal that we want ^ to match the beginning of every line in our multi-line string.
y = re.sub("^\s+", "", y, re.MULTILINE)
Unfortunately that doesn’t quite work either! The fourth positional argument to re.sub is a count of how many substitutions to make. It defaults to 0, which actually means infinity, i.e. replace all occurrences. You could set count to 1 to replace only the first occurrence, for example. If we’re not going to specify count, we have to pass flags by name rather than by position, i.e. the line above should be
y = re.sub("^\s+", "", y, flags=re.MULTILINE)
That works.
You could also abbreviate re.MULTILINE to re.M. The former is more explicit and the latter is more compact. To each his own. There’s more than one way to do it. [2]
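The gotcha above is worth seeing in action. Note that re.MULTILINE is just an integer (8 in CPython), so passed positionally it is silently taken as count:

```python
import re

y = "    abc\n    def\n    ghi\n"

# Positionally, the flag is swallowed as the `count` argument:
wrong = re.sub(r"^\s+", "", y, re.MULTILINE)
# Passed by name, it actually enables multi-line ^ matching:
right = re.sub(r"^\s+", "", y, flags=re.MULTILINE)

print(repr(wrong))  # 'abc\n    def\n    ghi\n'
print(repr(right))  # 'abc\ndef\nghi\n'
```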
Solution 3: Regular expression with a modifier
In my opinion, it is better to modify the regular expression itself than to pass in a flag. The modifier (?m) specifies that in the rest of the regular expression the ^ character should match the beginning of each line.
y = re.sub("(?m)^\s+", "", y)
One reason I believe this is better is that it moves information from a language-specific implementation of regular expressions into a regular expression syntax that is supported in many programming languages.
For example, the regular expression
(?m)^\s+
would have the same meaning in Perl and Python. The two languages have the same way of expressing modifiers [3], but different ways of expressing flags. In Perl you paste an m on the end of a match operator to accomplish what Python does by setting flags=re.MULTILINE.
One of the most commonly used modifiers is (?i) to indicate that a regular expression should match in a case-insensitive manner. Perl and Python (and other languages) accept (?i) in a regular expression, but each language has its own way of adding modifiers. Perl adds an i after the match operator, and Python uses
flags=re.IGNORECASE
or
flags=re.I
as a function argument.
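A quick sketch showing the two spellings agree:

```python
import re

text = "Python, PYTHON, python"

# (?i) embedded at the start of the pattern...
with_modifier = re.findall(r"(?i)python", text)
# ...is equivalent to passing the flag as a function argument:
with_flag = re.findall(r"python", text, flags=re.IGNORECASE)

print(with_modifier)  # ['Python', 'PYTHON', 'python']
```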
More on regular expressions
- Regular expressions in Perl and Python
- Regular expressions with Hebrew and Greek
- Why are regular expressions difficult
[1] Yes, I’ve heard the quip about two problems. It’s funny, but it’s not a universal law.
[2] “There’s more than one way to do it” is a mantra of Perl and contradicts The Zen of Python. I use the line here as a good-natured jab at Python. Despite its stated ideals, Python has more in common with Perl than it would like to admit and continues to adopt ideas from Perl.
[3] Python’s re module doesn’t support every regular expression modifier that Perl supports. I don’t know about Python’s regex module.
Weekly Python StackOverflow Report
(cclxi) stackoverflow python report
These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2021-01-30 15:03:53 GMT
- pip install failing on python2 - [21/3]
- Installing pip is not working in bitbucket CI - [17/2]
- Python Pip broken wiith sys.stderr.write(f"ERROR: {exc}") - [16/2]
- JSON serialized object gives error with multiprocessing calls - TypeError: XXX objects not callable error - [7/0]
- Efficient way of making time increment strings? - [6/5]
- How to calculate the distance between two points on lines in python - [6/4]
- How to format to n decimal places in Python - [6/3]
- Read .pptx file from s3 - [5/1]
- Why is iterating over a dict so slow? - [5/1]
- Why is the output not 1 and 1? - [5/1]
Ben Cook
How to pad arrays in NumPy: Basic usage of np.pad() with examples
The np.pad() function has a complex, powerful API. But basic usage is very simple and complex usage is achievable!
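A hedged sketch of that basic usage (the array values are made up):

```python
import numpy as np

a = np.array([1, 2, 3])

# Pad one zero on each side (the default mode is 'constant'):
print(np.pad(a, 1))                    # [0 1 2 3 0]

# Pad two values before and one after, repeating the edge values:
print(np.pad(a, (2, 1), mode="edge"))  # [1 1 1 2 3 3]
```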
Matt Layman
Are Django and Flask Similar?
Maybe you’re new to web development in Python, and you’ve encountered the two most popular Python web frameworks, Django and Flask, and have questions about which one you should use. Are Django and Flask similar tools for building web applications? Yes, Django and Flask share many similarities and can both make great websites, but they have some different development philosophies which will attract different types of developers. What do I know about this?
January 29, 2021
Daniel Roy Greenfeld
Recap of 2020 and Resolutions for 2021
Recap of 2020
I got a job saving the planet
For years I've been worried about climate change. I've tried to live a mostly ecologically friendly life, and this year I dived in with making compost for the garden. However, seeing the lack of consistent political will to correct the problem has had me worried.
So I was delighted that when I started to look for jobs in the autumn I was contacted by several firms attempting to address the problem. I've always thought that finance is a great motivator, albeit all-too-often the results are negative (the petroleum and coal industries are examples).
Ultimately I went with Octopus Energy, a company that is, to borrow a phrase, disrupting the legacy power industry in areas with freer energy markets (UK, Australia, Germany, Texas, Japan). They have embraced renewable energy as a source of power, which is good because it is cheaper than fossil fuels. Even if customers don't believe in global climate change, they won't be able to ignore the fact that renewables are a cheaper source of energy.
In any case, I can't begin to say how wonderful it is being part of a company whose mission is to address climate change. It's like all my skills and talents are now superpowers being used to save the world.
If you live in the US, UK, Germany, Australia, or Japan and want to join me using our various skills and talents to help save the world, here's your chance. We've got lots of openings, including for those who know Python and Django.
My daughter grew
Uma grew in size and intellect. The photo below demonstrates the changes in her over just 8 months. Seeing her expand in awareness and understanding is a journey I never imagined. The challenges in raising her have been met with happiness by myself and Audrey. Her presence also drove me toward the new job at Octopus Energy I took on in the autumn.

I learned to garden
Audrey and I once again tried our hand at gardening. In the past, our efforts were stymied by travel and work. Now, being home 99.9% of the year we finally managed to garner some success besides growing flowers.

I got into composting. The science of it is fascinating: you mix kitchen scraps and yard waste, add water, and, depending on the method, anywhere from days to weeks to months later you have soil that plants love.
I quarantined
Rather than risk myself, my family, and my friends, I entered quarantine until the COVID-19 pandemic was under control. It's not been easy; there are times I am restless. I've missed events, the worst of which were funerals of those who died. Nevertheless, I refused to become a statistic or cause others harm.
I voted
I voted on November 3rd. I believe voting should be mail-in or by the internet. I also believe the United States should get rid of gerrymandering and the electoral college.
Popular vote all the way!
I wrote books
More on this in a different blog post, but here's what I got out the door:
- Two Scoops of Django 3.x - Latest edition of our Two Scoops of Django series
- A Wedge of Django - Basic Django tutorial
- Necronaut - Horror science fiction novelette
Resolutions for 2021
Be the best father I can be
My daughter is part of my legacy, what will exist on this planet after I am gone. She motivates me to do my best at work and to take care of myself better so I can be stronger for her.
I will help her grow into the person she wants to be. That means teaching her new things, keeping her safe, and being there for her. I will treasure the time I have with her at this age and never take her presence for granted.
Lose my COVID-19
Yeah, I put on weight. Working to get it off. Steps I've taken:
- Eating less
- Ramping up my home exercise program
- Working at a standing desk
Once the pandemic is over I plan to restart my journey into Brazilian Jiu-Jitsu.
I will quarantine
I don't need to travel to do my job. I can get by without sitting in a restaurant to eat. I can talk to family, friends, and co-workers via any number of communication methods.
Even after I'm vaccinated, until the pandemic is under control, I'll keep up this behavior.
Yes, I know some of you reading this disagree with me. You may think that I'm being too cautious. However, right now in Los Angeles, huge numbers of people I know, or who are in my neighborhood, have ignored all the quarantine rules. The result is that the hospitals are overwhelmed, the staff are exhausted, and ambulances are impossible to get because they can't offload patients. Non-COVID emergency issues go untreated, meaning that surviving treatable allergic reactions or heart attacks is no longer so certain.
Yeah, I'm staying in quarantine and not doing anything stupid with me or my family's safety until things die down.
Ben Cook
Iterating over rows in Pandas
When you absolutely have to iterate over rows in a Pandas DataFrame, use the .itertuples() method.
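A minimal sketch (the DataFrame contents are made up):

```python
import pandas as pd

df = pd.DataFrame({"title": ["Blade Runner", "Alien"], "year": [1982, 1979]})

# .itertuples() yields one namedtuple per row; it is far faster than .iterrows():
labels = [f"{row.title} ({row.year})" for row in df.itertuples(index=False)]
print(labels)  # ['Blade Runner (1982)', 'Alien (1979)']
```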
PyCharm
The Transition to Apple Silicon
In June of last year, Apple announced that the Mac would transition to Apple’s own chips, called Apple Silicon. Here at PyCharm, this would mean major changes to the way we build our software. A change of this magnitude has not happened since the move from PowerPC to Intel’s x86 architecture.
Although the performance was somewhat acceptable on Rosetta 2, Apple’s new translation engine that translates the x86 instruction set to the M1’s ARM-based instruction set, it was not good enough for our IDEs.
In general, if you have a simple program then Rosetta 2 should be able to translate your program without significant overhead. However, our IDEs are built on top of our own custom Java Runtime Environment, and that is in no way a simple program.
JetBrains Runtime
Up until 2010, Apple bundled their own version of Java with their operating system. This meant that every time a new version of Java was released, Apple would need to patch it for their own operating system so that it did not have any security vulnerabilities. With the deprecation of Java on the Mac, certain things such as font rendering on retina screens became more difficult using the version of Java that Oracle released. To remedy this, JetBrains forked the OpenJDK project in order to gain better control over how the IDEs looked on Macs as well as other HiDPI screens; JetBrains Runtime was born, and we have bundled it with our IDEs since 2014.
The JetBrains Runtime ships with all our IDEs, and although this gives us more control, it also means that we need a large team to maintain this codebase. Furthermore, there are many facets of the runtime, and we do not know every little crevice of it; rather, we focus on the part of the code that handles the rendering of the UI on screens.
M1 Enters the Chat
The change to Apple Silicon meant that we’d need to re-write a lot of JetBrains Runtime to make sure that we had adequate performance. Although we had been experimenting with running applications on Raspberry Pi computers, this was a completely different issue; the M1 meant that ARM-based computers would soon become mainstream. Our IDEs couldn’t just run adequately on the M1; they had to run well on it.
To this end, we began to investigate how we could handle this transition with grace. It soon turned out that we had to re-write a lot of the JIT system, a core component of the JVM itself, which was something we had little to no experience in.
Eventually, we did manage to solve this issue with the help of Azul Systems. To hear the whole story, listen to the podcast, where I talk to Konstantin Bulenkov, who had to weather the storm of this fundamental change.
Stack Abuse
How to Format Number as Currency String in Python
Introduction
Having to manually format a number as a currency string can be a tedious process. You may have just a few lines of modifications to make, but when you need to do a fair number of conversions, it becomes very tedious.
The first step to automating these kinds of tasks is writing a function. In this article, we'll be going over a few methods you can use to format numbers as currency strings in Python.
Methods for Formatting Numbers
We'll be going over three alternate libraries and functions which allow us to convert numbers into currency strings:
- The locale module
- The str.format() method
- The Babel package
The locale module is already included in Python, though we'll have to install Babel to use it.
Format Number as Currency String with Locale
The locale module comes preinstalled with your version of Python. It allows developers to localize their applications: they don't have to know in which region their software will run, and can instead write a universal codebase which will dynamically change depending on the region of use.
Initializing the Locale
To begin using the locale module you first need to set the locale:
import locale
# To use default settings, set locale to None or leave second argument blank.
print(locale.setlocale(locale.LC_ALL, ''))
# To use a specific locale (Great Britain's locale in this case)
print(locale.setlocale(locale.LC_ALL, 'en_GB'))
The code above will produce the following output:
English_United States.1252
en_GB
To get the list of available locales, you can look it up on MS-LCID. Alternatively, you can print it out:
# For the Windows operating system
for lang in locale.windows_locale.values():
    print(lang)

# For other operating systems
for lang in locale.locale_alias.values():
    print(lang)
Running any of the above variants will yield something similar to:
en_GB
af_ZA
sq_AL
gsw_FR
am_ET
ar_SA
ar_IQ
ar_EG
ar_LY
ar_DZ
...
Formatting Numbers with Locale
With your preferred locale set, you can easily format number strings:
locale.setlocale(locale.LC_ALL, '')
# If you'd like groupings, set grouping to True; else set it to False or leave it out completely
print(locale.currency(12345.67, grouping=True))
print(locale.currency(12345.67))
Running the code above we get the following output:
$12,345.67
$12345.67
Using the str.format() method
The next method we'll be covering is the str.format() method, which has the advantage of being the most straightforward one:
number_string = 340020.8
# This portion is responsible for grouping the number
number_commas_only = "{:,}".format(number_string)
print(number_commas_only)
# To ensure we have two decimal places
number_two_decimal = "{:.2f}".format(number_string)
print(number_two_decimal)
# Both combined along with the currency symbol(in this case $)
currency_string = "${:,.2f}".format(number_string)
print(currency_string)
Running the code above we get the following output:
340,020.8
340020.80
$340,020.80
However, this approach is hard-coded, unlike the previous one, which localizes the formatting dynamically.
Format Number as Currency String with Babel
Using Babel is perhaps one of the lesser known methods, however it's very user-friendly and intuitive. It comes with number and currency formatting as well as other internationalizing tasks.
Unlike Python's locale module, you don't have to worry about making adjustments on a global scale.
To install Babel via pip, run the following command:
$ pip install Babel
...
Successfully installed Babel-2.9.0
Once installed, to achieve the same results as the two other methods listed above, you can simply call format_currency() on a string:
import babel.numbers

number_string = 340020.8

# The three needed arguments are the number, currency, and locale
print(babel.numbers.format_currency(number_string, "USD", locale='en_US'))
Running the code above we get the following output:
$340,020.80
To get the full list of locales available:
import babel.localedata

avail_loc = babel.localedata.locale_identifiers()
print(avail_loc)
Which looks something like this:
['af', 'af_NA', 'af_ZA', 'agq', 'agq_CM', 'ak', 'ak_GH', 'am', 'am_ET',...]
Searching For Numbers in Strings and Formatting as Currency
Sometimes, you don't work with direct numerical input, such as the input from a user. You might be working with a sentence, or a larger, unclean corpus. We can use the re module to filter through different types of input, find numerical values and format them.
Let's use all three of the approaches above to format the currency in a sentence:
import re
import locale
import babel.numbers
locale.setlocale(locale.LC_ALL, 'en_US')
Next we come up with the regex pattern needed to match the number strings:
# This pattern is used to match any number string
pattern = r'\d+(\.\d{1,2})?'
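As a quick sanity check on the pattern (with a made-up sample string):

```python
import re

pattern = r"\d+(\.\d{1,2})?"
sample = "budget is 180000, rent is 25000.67"

# finditer/group() give the whole match; group(1) is only the optional decimal part
matches = [m.group() for m in re.finditer(pattern, sample)]
print(matches)  # ['180000', '25000.67']
```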
Next we apply the three methods we've learned to the string variable message:
message = "Our current budget is 180000, we'll need 25000.67 to cover rent, then 23400.4 for food."
# re.sub() is used to substitute substrings that match a certain pattern
# with another string, in our case the return value of a lambda function
# which will return a matching currency string.
new_message_locale = re.sub(
    pattern, lambda x: locale.currency(float(x.group()), grouping=True), message
)
new_message_str = re.sub(
    pattern, lambda x: "${:,.2f}".format(float(x.group())), message
)
new_message_babel = re.sub(
    pattern,
    lambda x: babel.numbers.format_currency(float(x.group()), "USD", locale="en_US"),
    message,
)
Let's compare the original output with the output from all three methods:
print(message)
print(new_message_locale)
print(new_message_str)
print(new_message_babel)
Our current budget is 180000, we'll need 25000.67 to cover rent, then 23400.4 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.
Depending on the method you prefer, the length of this script can be reduced. There are certain limitations, as you may have noticed: the script as it is cannot differentiate between the number strings you'd like to format and those you'd like to leave alone. However, it can be changed easily depending on your needs and use cases.
Conclusion
In this article we took a look at a couple of ways of converting numbers into proper currency strings. We've covered the str.format() method, as well as the locale and babel modules.
Finally, we combined these methods with Python's regular expression module to achieve a wider range of uses. I hope you were able to learn something new from all this that can help save you time.
Real Python
The Real Python Podcast – Episode #45: Processing Images in Python With Pillow
Are you interested in processing images in Python? Do you need to load and modify images for your Flask or Django website or CMS? Then you most likely will be working with Pillow, the friendly fork of PIL, the Python imaging library. This week on the show, we have Mike Driscoll, who is writing a new book about image processing in Python.
Lucas Cimon
fpdf2.3.0 Unbreakable! and PDF quines

Today, I am happy to announce version 2.3.0 of fpdf2, code name: Unbreakable!
https://github.com/pyfpdf/fpdf2/
Doc: https://pyfpdf.github.io/fpdf2/
Why Unbreakable?
- As a tribute to the M. Night Shyamalan movie
- Because using fpdf2, your Python code can never break!
...
Just kidding, I would be …
January 28, 2021
PyCharm
PyCharm 2021.1 EAP starts now!
We are starting our new Early Access Program (EAP) for PyCharm 2021.1. You can now try the pre-release build of the upcoming v2021.1. It delivers enhanced support for Cython as well as UI and usability updates.
In short, our EAP allows anyone to try out the features we are implementing. Follow this series of EAP blog posts to get the latest information about the changes coming in PyCharm 2021.1.

We encourage you to join the program to try out the new and improved features. By testing these updates and giving us feedback, you can help us make PyCharm better for you. As always, you can download the new EAP from our website, get it from the free Toolbox App, or update using snap if you’re an Ubuntu user.
Important! PyCharm EAP builds are not fully tested and might be unstable.
In this post, we’ll take a look at the most notable updates from week one of the EAP.
Improved Cython type checker
PyCharm provides improved Cython support. In this EAP we have improved the type checker for Cython, which you can already try. In the next EAP releases we are planning to fix a number of Cython-related bugs. Check out our help page for more information on this.
VCS
Built-in Space
The Space plugin is now available. This means that you can connect your IDE to your organization in JetBrains Space to view and clone project repositories, write complex scripts that use Space APIs, and review your teammates’ code. To log in to Space, click the Get from VCS button on the Welcome screen, select Space on the left, and enter your organization URL in the dedicated field. It is also possible to log in via Tools | Space | Log in to Space.
Once logged in, you can clone the desired repository and open it in PyCharm. When you open it, Space Code Reviews will appear on the left-hand pane. From there, you can see a list of issues that contain your changes or require your attention. For example, if you are a reviewer, you can open an issue to see its author, look at the timeline, add comments inside a diff view, and more.
Configure a profile for pre-commit inspections
We’ve added the possibility to choose a code inspection profile before committing changes to VCS. To access this feature, click the gear icon to show commit options, select Analyze code checkbox, click Configure, and choose the desired profile. Profiles can be created in Preferences / Settings | Editor | Inspections. The IDE will use the selected profile when inspecting your code before the commit.

User experience
Built-in HTML preview
We’ve added a new built-in browser preview to PyCharm that allows you to quickly preview HTML files. Any changes you make to HTML files in the IDE, as well as in the linked CSS and JavaScript files, will be immediately saved and the preview will update on the fly.

To open the preview, click on the icon with the PyCharm logo in the widget in the top-right side of the editor.
Collaborative Development
Code With Me: audio and video calls
Code With Me allows you to do collaborative development like pair programming even if your peer doesn’t have an IDE. To make the experience even better, the product team has now included the ability to make audio and video calls in Code With Me. A detailed guide can be found here.
Community contributions
- The correct code insight for None is now provided in TypedDict by Morgan Bartholomew [PY-44714]
- Thanks to Bernat Gabor, an important debugging case in tox is now covered: Python isolated mode no longer affects subprocess debugging. [PY-45659]
Notable bug fixes
- Code insight logic is improved for ContextManager and the following other cases: function parameter annotated with built-in function type matches the passed function argument; functions that take modules as a parameter. [PY-29891] [PY-36062] [PY-43841]
- For PyTorch Tensors, Tensorflow tensors, Pandas GeoDataFrame (basically for all the classes that have a shape attribute, if this attribute is non-callable and iterable) you can now see the shapes in the variable pane. [PY-19764]
That’s it for week one! You can find the rest of the changes for this EAP build in the release notes. Stay tuned for more updates, and be sure to share your feedback in the comments below, on Twitter, or via our issue tracker.
Ready to join the EAP?
Some ground rules
- EAP builds are free to use and expire 30 days after the build date.
- You can install an EAP build side by side with your stable PyCharm version.
- These builds are not fully tested and can be unstable.
- Your feedback is always welcome. Please use our issue tracker and make sure to mention your build version
How to download
Download this EAP from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP. If you’re on Ubuntu 16.04 or later, you can use snap to get PyCharm EAP and stay up to date.
The PyCharm team
Martin Fitzpatrick
SAM Coupé SCREEN$ Converter — Interrupt optimizing image converter
The SAM Coupé was a British 8 bit home computer that was pitched as a successor to the ZX Spectrum, featuring improved graphics and sound and higher processor speed.
The SAM Coupé's high-color MODE4 could manage 256x192 resolution graphics, with 16 colors from a choice of 127. Each pixel can be set individually, rather than using PEN/PAPER attributes as on the Spectrum. But there's more. The SAM also supports line interrupts which allow palette entries to be changed on particular scan lines: a single palette entry can display multiple colors.
The limitation that color can only be changed per line means it's not really useful for games, or other moving graphics. But it does allow you to use a completely separate palette for "off screen" elements like panels. For static images, such as photos, it's more useful - assuming that the distribution of color in the image is favorable1.
Demonstration SCREEN$ were a regular feature of SAM Coupé disk magazines, but interrupts were rarely used since the placement had to be done manually. Now that we're living in the future, I wanted to have a crack at automating line interrupts to squeeze out as many colors as possible & let the SAM show off its capabilities.
tip: If you just want the converter, you can get it here. It is written in Python, using Pillow for image color conversion.
First a quick look at the SAM Coupé screen modes to see what we're dealing with.
Sam Coupe Screen Modes
There are 4 screen modes on the SAM Coupé.
- MODE 1 is the ZX Spectrum compatible mode, with 8x8 blocks which can contain 2 colors PAPER (background) and PEN (foreground). The framebuffer in MODE 1 is non-linear, in that line 1 is followed by line 8.
- MODE 2 also uses attributes, with PAPER and PEN, but the cells are 8x1 pixels and the framebuffer is linear. This MODE wasn't used a great deal.
- MODE 3 is high resolution, with double the X pixels but only 4 colours -- making it good for reading text.
- MODE 4 is the high color mode, with 256x192 and independent coloring of every pixel from a palette of 16. Most games/software used this mode.
| Mode | Dimensions | Framebuffer | bpp | Colors | Size | Notes |
|---|---|---|---|---|---|---|
| 4 | 256×192 | linear | 4 | 16 | 24 KB | High color |
| 3 | 512×192 | linear | 2 | 4 | 24 KB | High resolution |
| 2 | 256×192 | linear | 1 | 16 | 12 KB | Color attributes for each 8x1 block |
| 1 | 256×192 | non-linear | 1 | 16 | 6.75KB | Color attributes for each 8×8 block; matches ZX Spectrum |
Most SAM Coupe SCREEN$ were in MODE 4, so that's what we'll be targeting. It would be relatively easy to support MODE 3 on top of this2.
The SCREEN$ format
The format itself is fairly simple, consisting of the following bytes.
| Bytes | Content |
|---|---|
| 24576 | Pixel data. Mode 4, 4bpp: 1 byte = 2 pixels; Mode 3, 2bpp: 1 byte = 4 pixels |
| 16 | Mode 4 Palette A |
| 4 | Mode 3 Palette A store |
| 16 | Mode 4 Palette B |
| 4 | Mode 3 Palette B store |
| Variable | Line interrupts 4 bytes per interrupt (see below) |
| 1 | FF termination byte |
In MODE 4 the pixel data is 4bpp, that is, 1 byte = 2 pixels (16 possible colors). To handle this we can create our image with 16 colors and bit-shift the values before packing adjacent pixels into a single byte.
Palette A & B
As shown in the table above the SAM actually supports two simultaneous palettes (here marked A & B). These are full palettes which are alternated between, by default 3 times per second, to create flashing effects. The entire palette is switched, but you can opt to only change a single color. The rate of flashing is configurable with:
POKE &5A08, <value>
The <value> is the time between swaps of the alternate palettes, in 50ths of a second. This is generally only useful for creating flashing cursor effects 3. For converting to SAM SCREEN$ we'll be ignoring this and just duplicating the palette.
note: The exporter supports palette flash for GIF export.
MODE 3 Store
The palettes of MODE 3 and MODE 4 are separate, but palette operations act on the same CLUT. When switching between MODE 3 and MODE 4, 4 colors are set aside to a temporary store and restored when switching back. These values are also saved when saving SCREEN$ files (see the "store" entries above), so you can replace the MODE 3 palette by loading a MODE 4 screen. It's a bit odd.
We can ignore this for our conversions and just write a default set of bytes.
Interrupts
Interrupts define locations on the screen where a given palette entry (0-15) changes to a different color from the 127 system palette. They are encoded with 4 bytes per interrupt, with multiple interrupts appended one after another.
| Bytes | Content |
|---|---|
| 1 | Y position, stored as 172-y (see below) |
| 1 | Color to change |
| 1 | Palette A |
| 1 | Palette B |
Interrupt coordinates set from BASIC are calculated from -18 up to 172 at the top of the screen. The plot range in BASIC is actually 0..173, but interrupts can't affect the first pixel (which makes sense, since this is handled through the main palette).
When stored in the file, line interrupts are stored as 172-y. For example, a line interrupt at 150 is stored in the file as 22. The line interrupt nearest the top of the screen (1st row down, interrupt position 172) would be stored as 172-172=0.
This sounds complicated, but actually means that to get our interrupt Y byte we can just subtract 1 from the Y coordinate in the image.
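As a sketch of that encoding (a hypothetical helper, not code from the converter itself), packing one interrupt into its 4 bytes looks like this, with the image row counted from the top:

```python
def encode_interrupt(image_y, color_index, palette_a, palette_b):
    # Y is stored as 172 - interrupt_y; for an image row counted from
    # the top, that works out to image_y - 1 (row 1 -> stored 0).
    return bytes([image_y - 1, color_index, palette_a, palette_b])

# An interrupt on image row 23, switching palette entry 5 to SAM color 10
# in both palettes:
data = encode_interrupt(23, 5, 10, 10)  # b'\x16\x05\n\n'
```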
Converting Image to SCREEN$
We now have all the information we need to convert an image into a SCREEN$ format. The tricky bit (and what takes most of the work) is optimising the placement of the interrupts to maximise the number of colors in the image.
Pre-processing
Processing is done using the Pillow package for Python. Input images are resized and cropped to fit using the ImageOps.fit() method, with centering.
SAM_COUPE_MODE4 = (256, 192, 16)
WIDTH, HEIGHT, MAX_COLORS = SAM_COUPE_MODE4
im = Image.open(fn)
# Resize with crop to fit.
im = ImageOps.fit(im, (WIDTH, HEIGHT), Image.ANTIALIAS, 0, (0.5, 0.5))
If the above crop is bad, you can adjust it by pre-cropping/sizing the image beforehand. There isn't the option to shrink without cropping as any border area would waste a palette entry to fill the blank space.
Interrupts
This is the bulk of the process for generating optimized images. The optimize method, shown below, covers the high-level steps taken to reach the optimal number of colors, using interrupts to compress colors.
def optimize(im, max_colors, total_n_colors):
    """
    Attempts to optimize the number of colors in the screen using interrupts.
    The result is a dictionary of color regions, keyed by color number.
    """
    optimal_n_colors = max_colors
    optimal_color_regions = {}
    optimal_total_interrupts = 0
    for n_colors in range(max_colors, total_n_colors + 1):
        # Identify color regions.
        color_regions = calculate_color_regions(im, n_colors)
        # Compress non-overlapping colors together.
        color_regions = compress_non_overlapping(color_regions)
        # Simplify our color regions.
        color_regions = simplify(color_regions)
        total_colors = len(color_regions)
        # Calculate how many interrupts we're using; drop the initial block.
        _, interrupts = split_initial_colors(color_regions)
        total_interrupts = n_interrupts(interrupts)
        print("- trying %d colors, with interrupts uses %d colors & %d interrupts" % (n_colors, total_colors, total_interrupts))
        if total_colors <= max_colors and total_interrupts <= MAX_INTERRUPTS:
            optimal_n_colors = n_colors
            optimal_color_regions = color_regions
            optimal_total_interrupts = total_interrupts
            continue
        break
    print("Optimized to %d colors with %d interrupts (using %d palette slots)" % (optimal_n_colors, optimal_total_interrupts, len(optimal_color_regions)))
    return optimal_n_colors, optimal_color_regions
The method accepts the image to compress and a max_colors argument, which is the number of colors supported by the screen mode (16). This is the lower bound: the minimum number of colors we should be able to get in the image. The total_n_colors argument contains the total number of colors in the image, capped at 127, the number of colors in the SAM palette. This is the upper bound: the maximum number of colors we can use. If total_n_colors < 16 we skip optimization entirely.
Each optimization round is as follows -
- calculate_color_regions generates a dictionary of color regions in the image. Each region is a (start, end) tuple of y positions in the image where a particular color is found. Each color will usually have many blocks.
- compress_non_overlapping takes colors with few blocks and tries to combine them with other colors with no overlapping regions: transitions between colors will be handled by interrupts.
- simplify takes the resulting color regions and tries to simplify them further, grouping blocks back with their own colors if they can and then combining adjacent blocks.
- total_colors, the length of color_regions, is now the number of colors used.
- split_initial_colors removes the first block, to get the total number of interrupts.
note: The compress_non_overlapping algorithm makes no effort to find the best compression of regions - I experimented with this a bit and it just explodes the number of interrupts for little real gain in image quality.
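The core test behind that compression step is whether two (start, end) regions overlap. A minimal sketch of the idea (hypothetical helper names, not the converter's actual code):

```python
def regions_overlap(a, b):
    # (start, end) spans overlap unless one ends before the other starts.
    return not (a[1] < b[0] or b[1] < a[0])

def colors_can_merge(regions_a, regions_b):
    # Two colors can share a palette slot only if none of their
    # regions overlap; the transitions are then handled by interrupts.
    return not any(regions_overlap(a, b)
                   for a in regions_a for b in regions_b)
```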
The optimization process is brute force - step forward, increase the number of colors by 1 and perform the optimization steps above. If the number of colors > 16 we've gone too far: we return the last successful result, with colors <= 16.
SAM Coupé Palette
Once we have the colors for the image we map the image over to the SAM Coupé palette. Every pixel in the image must have a value between 0-15 -- pixels for colors controlled by interrupts are mapped to their "parent" color. Finally, all the colors are mapped across from their RGB values to the nearest SAM palette number equivalent.
note: This is sub-optimal, since the choice of colors should really be informed by the colors available. But I couldn't find a way to get Pillow to quantize to a fixed palette without dithering.
The mapping is done by calculating the distance in RGB space for each color to each color in the SAM 127 color palette, using the usual RGB color distance algorithm.
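The distance helper the conversion relies on isn't shown in the post; the usual Euclidean form in RGB space would look something like this (an assumption — the converter's own helper may differ):

```python
def distance(c1, c2):
    # Euclidean distance between two (r, g, b) tuples.
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5
```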
def convert_image_to_sam_palette(image, colors=16):
    new_palette = []
    rgb = image.getpalette()[:colors * 3]
    for r, g, b in zip(rgb[::3], rgb[1::3], rgb[2::3]):
        def distance_to_color(o):
            return distance(o, (r, g, b))
        spalette = sorted(SAM_PALETTE, key=distance_to_color)
        new_palette.append(spalette[0])
    palette = [c for i in new_palette for c in i]
    image.putpalette(palette)
    return image
Packing bits
Now that our image contains pixels with values 0-15, we can pack the bits and export the data. We iterate through the flattened data in steps of 2 and pack each pair of pixels into a single byte:
pixels = np.array(image16)
image_data = []
pixel_data = pixels.flatten()
# Generate bytestream and palette; pack to 2 pixels/byte.
for a, b in zip(pixel_data[::2], pixel_data[1::2]):
    byte = (a << 4) | b
    image_data.append(byte)
image_data = bytearray(image_data)
The operation a << 4 shifts the bits of integer a left by 4, so 15 (00001111) becomes 240 (11110000), while | ORs the result with b. If a = 0100 and b = 0011 the result would be 01000011 with both values packed into a single byte.
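Unpacking is the mirror image, which is what a SCREEN$-to-image converter needs to do — split each byte back into its high and low nibbles:

```python
def unpack_byte(byte):
    # High nibble is the first pixel, low nibble the second.
    return byte >> 4, byte & 0x0F

# 0x43 packs pixels 4 and 3:
pixels = unpack_byte(0x43)  # (4, 3)
```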
Writing the SCREEN$
Finally, the image data is written out, along with the palette data and line interrupts.
# Additional 4 bytes 0, 17, 34, 127; mode 3 temporary store.
bytes4 = b'\x00\x11\x22\x7F'
with open(outfile, 'wb') as f:
    f.write(image_data)
    # Write palette.
    f.write(palette)
    # Write extra bytes (4 bytes, 2nd palette, 4 bytes).
    f.write(bytes4)
    f.write(palette)
    f.write(bytes4)
    # Write line interrupts.
    f.write(interrupts)
    # Write final byte.
    f.write(b'\xff')
To actually view the result, I recommend the SAM Coupé Advanced Disk Manager.
You can see the source code for the img2sam converter on Github.
Examples
Below are some example images, converted from PNG/JPG source images to SAM Coupé MODE 4 SCREEN$ and then back into PNGs for display. The palette of each image is restricted to the SAM Coupé's 127 colors and colors are modified using interrupts.
Pool 16 colors, no interrupts
Pool 24 colors, 12 interrupts (compare gradients)
This image pair shows the effect of line interrupts on an image without dither. The separation between the differently colored pool balls makes this a good candidate.
Leia 26 colors, 15 interrupts
Tully 22 colors, 15 interrupts
The separation between the helmet (blue, yellow components) and horizontal line in the background make this work out nicely. Same for the second image of Tully below.
Isla 18 colors, 6 interrupts
Tully (2) 18 colors, 5 interrupts
Dana 17 colors, 2 interrupts
Lots of images don't compress well because the same shades are used throughout the image. This is made worse by the conversion to the SAM's limited palette of 127 colors.
Interstellar 17 colors, 3 interrupts
Blade Runner 16 colors (11 used), 18 interrupts
This last image doesn't manage to squeeze more than 16 colors out of the image, but does reduce the number of colors used for those 16 to just 11. This gives you 5 spare colors to add something else to the image.
Converting SCREEN$ to Image
Included in the scrimage package is the sam2img converter, which will take a SAM MODE4 SCREEN$ and convert it to an image. The conversion process respects interrupts and when exporting to GIF will export flashing palettes as animations.
The images above were all created using sam2img on SCREEN$ created with img2sam. The following two GIFs are examples of export from SAM Coupe SCREEN$ with flashing palettes.
Flashing palette
Flashing palette and flashing Line interrupts
You can see the source code for the sam2img converter on Github.
- An ideal image either has gradients down the image, or regions of isolated non-overlapping color. But it's hard to predict as conversion to the SAM palette can run some colors together. ↩
- I experimented a bit with converting to MODE 3, but only 4 colors meant not very exciting results. ↩
- With faster flash speeds (1 50th/second) you can use it to sort of merge nearby colors to create additional shades, while giving yourself a headache. ↩
EuroPython
EuroPython 2021: Getting ready
We're happy to announce the pre-launch website for this year's EuroPython 2021:
EuroPython 2021 – https://ep2021.europython.eu/
The site comes with an FAQ page, which lists all the information we have for you at the moment. We're repeating the most important part here:
EuroPython 2021 will be held online from July 26 - August 1, 2021,
using the following structure:
- two workshop/training days (July 26 - 27)
- three conference days (July 28 - 30)
- two sprint days (July 31 - August 1)
The next steps are preparing the main conference website, adding content, organizing the call for papers (CFP), setting up the ticket system and the financial aid program, and getting everything tested and deployed.
Want to join the fun?
We'll have busy weeks ahead of us. If you want to help, please consider contacting us with details on where you would like to contribute. Please write to volunteers@europython.eu.
Distributed conferencing
We are also looking into setting up what we call EuroPython Nodes, where attendees can join small groups around the world to attend EuroPython 2021 together. Please see our FAQ entry for details. The idea is still in flux and we'd like to get some feedback from user groups or companies interested in participating.
PS: We have also moved our blog off of Tumblr and onto our own infrastructure. Hope you like the new design.
Enjoy,
EuroPython 2021 Team
https://www.europython-society.org/
Python Software Foundation
PSF Scientific Working Group announces call for grant requests
The Scientific Working Group of the Python Software Foundation is excited to announce its next call for funding proposals in 2020/2021. The goal of the Scientific Working Group is to advance the scope, breadth, and utility of Python for scientific work. This call places specific emphasis on maintainer/maintenance support, outreach and education, and improved infrastructure & documentation. We would be especially excited to fund developers within the Global South.
A proposal might be, for example, to implement continuous integration or overhaul documentation for an existing scientific Python project. Funding is for a maximum of three months. A brief report, to be provided upon conclusion of the grant, will inform future extension or expansion of grants.
Funding guidelines are outlined in the working group charter, and previously funded events and projects are listed on our homepage. Funding is for up to $4,000 USD per project. Project maintainers and creators are encouraged to reach out to the Working Group if they are aware of developers who may benefit from these funds.
We look forward to sharing the work produced by our grant-holders with the Scientific Python community, and inspiring others to take advantage of these funding opportunities.
Please submit proposals as directed at https://www.python.org/psf/grants/. We have extended our deadline to consider the first batch of grant requests on February 15th, 2021, and announce decisions at the end of that month. Subsequent proposals will be evaluated on a quarterly basis.
Codementor
How to Run Selenium Tests Using IE Driver?
A complete tutorial on how you can automate the testing of web applications on IE using Selenium IE driver on Python, Java, C#, and PHP.
Python Pool
Matplotlib Annotate Explained with Examples
Hello geeks, and welcome! In this article, we will cover Matplotlib Annotate. Along with that, for an overall better understanding, we will also look at its syntax and parameters. Then we will see the theory applied through a couple of examples.
First, let us develop a brief understanding of Matplotlib Annotate. But before that, here is an overview of the Matplotlib library. It is the plotting library of Python and builds on the NumPy library. With this library's help, we plot the different graphs our programs call for. It comes in very handy when writing programs for data science.
Now, coming back to our function. In general, the term annotate means to label something. Suppose you draw a V-I characteristic graph: you label the x-axis as V (voltage) and the y-axis as I (current). Similarly, this function helps us label graphs generated with the help of Matplotlib. It will become clearer as we discuss a few examples. But before that, in the next section, we will look at its syntax.
Syntax
matplotlib.pyplot.annotate()
This is the general syntax of our function. It has several parameters associated with it, which we will be covering in the next section.
PARAMETERS
1. text
This parameter represents the text that we want to annotate with.
2. xy
This parameter represents the point (x, y) to annotate.
3. xytext
An optional parameter representing the position (x, y) at which to place the text.
4. xycoords
This parameter is a string giving the coordinate system that xy uses (the default is "data").
5. arrowprops
This optional parameter is a dict of properties used to draw the arrow. By default it is None.
EXAMPLES
Now that we are done with the theory related to Matplotlib Annotate, in this section we will look at how this function works and how it helps us achieve our desired output. We will start with an elementary example and gradually work our way up to more complicated ones.
1. Sine waveform
import matplotlib.pyplot as plt
import numpy as pp

fig, ppool = plt.subplots()
t = pp.arange(0.0, 1.0, 0.001)
s = pp.sin(2 * pp.pi * t)
line = ppool.plot(t, s, lw=2)
ppool.annotate('Peak Value', xy=(0.25, 1),
               xytext=(1, 1),
               arrowprops=dict(facecolor='yellow', shrink=0.05),
               xycoords="data")
ppool.set_ylim(-1.5, 1.5)
plt.show()

Above we can see the first example of our function in action. Here our goal is to plot a sine waveform. To do so, we first import the NumPy and Matplotlib libraries. Then we use NumPy's arange function, which generates values over the specified range. Those values are passed through the sine function, sin(2*pi*t). After that comes the annotation part: xy is the point the arrow should mark, and arrowprops holds the properties controlling how the arrow looks. We have also used xytext and xycoords. Finally, we set the y-axis limits and plot the result.
2. Full-wave rectifier from cosine signal
import matplotlib.pyplot as plt
import numpy as pp

fig, ppool = plt.subplots()
t = pp.arange(0.0, 1.0, 0.001)
s = pp.cos(2 * pp.pi * 5 * t)
line = ppool.plot(t, s, lw=2)
# Annotation
ppool.annotate('Peak Values', xy=(0, 1),
               xytext=(1, 1),
               arrowprops=dict(facecolor='green', shrink=0.05),
               xycoords="data")
plt.xlabel("Time")
plt.ylabel("Output Voltage")
ppool.set_ylim(-1, 1)
# Plot the annotated graph.
plt.show()

In this example, our goal is to plot the output of a full-wave rectifier for a cosine signal. Here the peak-to-peak value is -1 to 1, and the frequency of the given cosine signal is 5 Hz. Apart from the things mentioned above, we have used plt.xlabel and plt.ylabel, which give us the freedom to label the two axes as well.
Different application
1. Annotate Scatter Plot
We can annotate a scatter plot using this method; let us look at an example.
import matplotlib.pyplot as plt

y = [3.2, 3.9, 3.7, 3.5, 3.02199]
x = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [155, "outliner", 293, 230, 670]
fig, ax = plt.subplots()
ax.scatter(x, y)
for i, txt in enumerate(n):
    ax.annotate(txt, (x[i], y[i]))
plt.show()

Above, to build the scatter plot, we first define the coordinates along the x and y axes. Then, corresponding to each point, we declare its annotation. Finally, we use a for loop to attach the annotation to each point automatically.
2. Annotate Bar chart
import matplotlib.pyplot as plt
import numpy as np

labels = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
shop_a = [20, 33, 30, 28, 27]
shop_b = [25, 32, 33, 20, 25]
x = np.arange(len(labels))
width = 0.35  # the width of the bars
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, shop_a, width, label='Sales-a')
rects2 = ax.bar(x + width/2, shop_b, width, label='sales-b')
ax.set_ylabel('Sales')
ax.set_title('Sales report of 2 shops')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()

def autolabel(rects):
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),
                    textcoords="offset points", size=16, color="Green",
                    ha='center', va='bottom')

autolabel(rects1)
autolabel(rects2)
fig.tight_layout()
plt.show()

Here we have successfully annotated the bar graph. The graph compares the products sold by two shops from Monday to Friday. Another thing to pay attention to is that we have customized the size and colour of the annotated text. You can do so simply by adding color="" and size="" to the annotate call.
Error-matplotlib annotate not showing
This error may occur when working with this function. The primary reason is usually a mistake in your code. One of the most common mistakes is placing the annotation text far outside the axes you care about. In that case the annotation exists, but you can't see it because it falls outside the visible area.
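To see this in action, here is a small sketch (using the non-interactive Agg backend, so it runs headless): the annotation text at (5, 5) sits outside the default axes limits, so it only becomes visible once the limits are widened or the text is moved closer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, for this sketch only
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
# xytext=(5, 5) is far outside the default data limits, so the text
# would be drawn out of view...
ax.annotate("peak", xy=(0.5, 0.5), xytext=(5, 5),
            arrowprops=dict(facecolor="red", shrink=0.05))
# ...widening the limits (or moving xytext) makes it visible again.
ax.set_xlim(0, 6)
ax.set_ylim(0, 6)
fig.savefig("annotate_visible.png")
```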
Must Read
- Working With Matplotlib Text in Python
- Python Spectrogram Implementation in Python from scratch
- Convert Text File to PDF Using Python | FPDF
Conclusion
In this article, we covered the Matplotlib Annotate. Besides that, we have also looked at its syntax and parameters. For better understanding, we looked at a couple of examples. We varied the syntax and looked at the output for each case. In the end, we can conclude that function Matplotlib Annotate is used to annotate graphs generated by Matplotlib.
I hope this article was able to clear all your doubts. In case you have any unsolved queries, feel free to write them below in the comment section. Done reading this? Why not read about argpartition next.
The post Matplotlib Annotate Explained with Examples appeared first on Python Pool.
Stéphane Wirtel
Weekly Update, January 28, 2021
Weekly Update
Past: Since January, I have had no mission and no customers, for a few reasons. When you work for a customer it's fun, because you can learn a lot of technologies, but you don't improve your own processes and you can't learn for yourself. That's why since January I have fixed a lot of things (for my company and at home) and also started to learn new technologies. My wife and I are happy with the improvements at home; they were really simple, but without time, it's just difficult.
Talk Python to Me
#301 Deploying and running Django web apps in 2021
Have you been learning Django and now want to get your site online? Not sure the best way to host it or the trade-offs between the various options? Maybe you want to make sure your Django site is secure. On this episode, I'm joined by two Django experts, Will Vincent and Carlton Gibson, to talk about deploying and running Django in production, along with recent updates in Django 3.2 and beyond.
Links from the show:
- Guest, Will Vincent: https://wsvincent.com/
- Guest, Carlton Gibson: https://twitter.com/carltongibson
- Watch the live stream: https://www.youtube.com/watch?v=9SmY2b1EwwY
- Give me back my monolith: https://www.craigkerstiens.com/2019/03/13/give-me-back-my-monolith/
- Carlton’s Button hosting platform: https://btn.dev/
- Django Software Foundation: https://www.djangoproject.com/foundation/
- Django News newsletter: https://django-news.com/
- Deployment Checklist: https://docs.djangoproject.com/en/3.1/howto/deployment/checklist/
- Environs, 3rd party package for environment variables: https://github.com/sloria/environs#usage-with-django
- Django Static Files & Templates: https://learndjango.com/tutorials/django-static-files
- Learn Django: https://LearnDjango.com
- Configuring uWSGI for Production Deployment @ Bloomberg: https://www.techatbloomberg.com/blog/configuring-uwsgi-production-deployment/
Sponsors:
- Square: https://talkpython.fm/square
- Linode: https://talkpython.fm/linode
- Talk Python Training: https://talkpython.fm/training
Matt Layman
Customer Feature - Building SaaS #89
In this episode, I show you how to take a feature idea from nothing to something. We added new UI, wrote a new view, a new form, and all the associated test code to prove that the feature works. My customer wants the ability to complete a task on any day she desires. The feature flow looks like: Click a calendar icon next to a task on a student’s course page.
Python⇒Speed
Speed up pip downloads in Docker with BuildKit's new caching
Docker uses layer caching to speed up builds, but layer caching isn’t always enough. When you’re rapidly developing your Python application and therefore frequently changing the list of dependencies, you’re going to end up downloading the same packages.
Over and over and over again.
This is no fun when you depend on small packages. It’s extra no fun when you’re downloading machine learning libraries that take hundreds of megabytes.
With the release of a stable Docker BuildKit, Docker now supports a new caching mechanism that can cache these downloads.
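The article goes into the details; the general shape of the mechanism is a BuildKit cache mount on pip's cache directory, along the lines of this sketch (hypothetical file names, enable BuildKit with DOCKER_BUILDKIT=1):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.9-slim
COPY requirements.txt .
# The cache mount persists /root/.cache/pip across builds, so when
# requirements change, only the new packages are downloaded.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```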
Read more...
January 27, 2021
RoseHosting Blog
How to install Python 3.9 on Ubuntu 20.04
Python is a free, open-source, and one of the most popular programming languages around the world. It is a versatile ...
Read more: How to install Python 3.9 on Ubuntu 20.04
The post How to install Python 3.9 on Ubuntu 20.04 appeared first on RoseHosting.
PyCharm
PyCharm 2020.3.3 Is Available
The final bug-fix release for PyCharm 2020.3 is out.
You can upgrade to v2020.3.3 with the Toolbox App, or right from the IDE, or by using snap if you are an Ubuntu user. It is also available for download from our website.
Here are the most notable fixes introduced in v2020.3.3:
- It is now possible to open projects in tabs on macOS Big Sur. [JBR-2893]
- Fixed some shortcut issues on Macs with an M1 chip. [JBR-2981][JBR-2999]
- Fixed patch updates for Apple Silicon IDE builds. [IDEA-258792]. Please note that it is not possible to use a patch to update from v2020.3.1 to this build on Apple Silicon.
- Fixed the IDE’s behavior when you double-click on a file in Local Changes. This action now opens the file in the editor. Alternatively, you can select an option to “Show Diff Instead of File Content on Double-Click” in the context menu. [IDEA-235910]
- Fixed the incorrect focus when dragging a file over an IDE window in Windows 10. [IDEA-244511]
- Fixed the completion of field lookups for Django models. [PY-45879]
- Fixed the problem causing numpy.mean to be flagged as an unresolved reference. [PY-46169]
- Resolved the issue involving the wrong cell background for deactivated colored cell mode in Powerful Data Viewer, thanks to the contribution of Daniel Schmidt. [PY-45894]
- Fixed the bug with newly added JSX tags causing simultaneous editing of non-related closing tags. [WEB-49051]
- In React, code completion now works for dynamically evaluated className attribute values. [WEB-43318]
You can refer to the release notes for a full list of issues resolved in this version. Update to v2020.3.3 now, and don’t forget to share your feedback with us in the comments to this post or post your suggestions to our issue tracker.
Real Python
Stochastic Gradient Descent Algorithm With Python and NumPy
Stochastic gradient descent is an optimization algorithm often used in machine learning applications to find the model parameters that correspond to the best fit between predicted and actual outputs. It’s an inexact but powerful technique.
Stochastic gradient descent is widely used in machine learning applications. Combined with backpropagation, it’s dominant in neural network training applications.
In this tutorial, you’ll learn:
- How gradient descent and stochastic gradient descent algorithms work
- How to apply gradient descent and stochastic gradient descent to minimize the loss function in machine learning
- What the learning rate is, why it’s important, and how it impacts results
- How to write your own function for stochastic gradient descent
Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you’ll need to take your Python skills to the next level.
Basic Gradient Descent Algorithm
The gradient descent algorithm is an approximate and iterative method for mathematical optimization. You can use it to approach the minimum of any differentiable function.
Note: There are many optimization methods and subfields of mathematical programming. If you want to learn how to use some of them with Python, then check out Scientific Python: Using SciPy for Optimization and Hands-On Linear Programming: Optimization With Python.
Although gradient descent sometimes gets stuck in a local minimum or a saddle point instead of finding the global minimum, it’s widely used in practice. Data science and machine learning methods often apply it internally to optimize model parameters. For example, neural networks find weights and biases with gradient descent.
Cost Function: The Goal of Optimization
The cost function, or loss function, is the function to be minimized (or maximized) by varying the decision variables. Many machine learning methods solve optimization problems under the surface. They tend to minimize the difference between actual and predicted outputs by adjusting the model parameters (like weights and biases for neural networks, decision rules for random forest or gradient boosting, and so on).
In a regression problem, you typically have the vectors of input variables 𝐱 = (𝑥₁, …, 𝑥ᵣ) and the actual outputs 𝑦. You want to find a model that maps 𝐱 to a predicted response 𝑓(𝐱) so that 𝑓(𝐱) is as close as possible to 𝑦. For example, you might want to predict an output such as a person’s salary given inputs like the person’s number of years at the company or level of education.
Your goal is to minimize the difference between the prediction 𝑓(𝐱) and the actual data 𝑦. This difference is called the residual.
In this type of problem, you want to minimize the sum of squared residuals (SSR), where SSR = Σᵢ(𝑦ᵢ − 𝑓(𝐱ᵢ))² for all observations 𝑖 = 1, …, 𝑛, where 𝑛 is the total number of observations. Alternatively, you could use the mean squared error (MSE = SSR / 𝑛) instead of SSR.
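As a quick sketch of these definitions, SSR and MSE can be computed in a few lines of NumPy; the observed outputs and predictions below are made up purely for illustration:

```python
import numpy as np

# Hypothetical observed outputs y and model predictions f(x)
y = np.array([3.0, 5.0, 7.0, 9.0])
f_x = np.array([2.8, 5.1, 7.3, 8.7])

residuals = y - f_x                # yᵢ − f(xᵢ)
ssr = np.sum(residuals ** 2)       # sum of squared residuals
mse = ssr / len(y)                 # mean squared error = SSR / n

print(ssr)
print(mse)
```

The two quantities differ only by the factor 1/𝑛, so minimizing one minimizes the other.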
Both SSR and MSE use the square of the difference between the actual and predicted outputs. The lower the difference, the more accurate the prediction. A difference of zero indicates that the prediction is equal to the actual data.
SSR or MSE is minimized by adjusting the model parameters. For example, in linear regression, you want to find the function 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ, so you need to determine the weights 𝑏₀, 𝑏₁, …, 𝑏ᵣ that minimize SSR or MSE.
In a classification problem, the outputs 𝑦 are categorical, often either 0 or 1. For example, you might try to predict whether an email is spam or not. In the case of binary outputs, it’s convenient to minimize the cross-entropy function, which also depends on the actual outputs 𝑦ᵢ and the corresponding predictions 𝑝(𝐱ᵢ): 𝐶 = −Σᵢ(𝑦ᵢ log(𝑝(𝐱ᵢ)) + (1 − 𝑦ᵢ) log(1 − 𝑝(𝐱ᵢ))).
In logistic regression, which is often used to solve classification problems, the functions 𝑝(𝐱) and 𝑓(𝐱) are defined as follows: 𝑝(𝐱) = 1 / (1 + exp(−𝑓(𝐱))), with 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ.
Again, you need to find the weights 𝑏₀, 𝑏₁, …, 𝑏ᵣ, but this time they should minimize the cross-entropy function.
Gradient of a Function: Calculus Refresher
In calculus, the derivative of a function shows you how much a value changes when you modify its argument (or arguments). Derivatives are important for optimization because a zero derivative might indicate a minimum, maximum, or saddle point.
The gradient of a function 𝐶 of several independent variables 𝑣₁, …, 𝑣ᵣ is denoted with ∇𝐶(𝑣₁, …, 𝑣ᵣ) and defined as the vector function of the partial derivatives of 𝐶 with respect to each independent variable: ∇𝐶 = (∂𝐶/∂𝑣₁, …, ∂𝐶/∂𝑣ᵣ). The symbol ∇ is called nabla.
The nonzero value of the gradient of a function 𝐶 at a given point defines the direction and rate of the fastest increase of 𝐶. When working with gradient descent, you’re interested in the direction of the fastest decrease in the cost function. This direction is determined by the negative gradient, −∇𝐶.
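Putting these pieces together, the basic algorithm repeatedly steps in the direction of −∇𝐶. The following is a minimal sketch, not the article's implementation; the function names and the quadratic test function are illustrative:

```python
def gradient_descent(gradient, start, learn_rate, n_iter=100, tol=1e-6):
    """Minimal gradient descent: step against the gradient until the step is tiny."""
    x = start
    for _ in range(n_iter):
        step = learn_rate * gradient(x)
        if abs(step) < tol:
            break
        x -= step
    return x

# Minimize C(x) = (x - 3)**2, whose gradient is 2 * (x - 3)
minimum = gradient_descent(gradient=lambda x: 2 * (x - 3), start=10.0, learn_rate=0.1)
print(minimum)  # close to 3.0
```

Picking the learning rate matters: too small and convergence is slow, too large and the iterates can overshoot and diverge.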
Read the full article at https://realpython.com/gradient-descent-algorithm-python/ »
Python Pool
Exploring numpy.ones Function in Python | np.ones
When a project calls for mathematical operations and calculations, we usually reach for the NumPy library. NumPy arrays make advanced mathematical and other types of operations on large amounts of data fast and convenient. Sometimes we need an array filled entirely with ones, and that’s when numpy.ones comes into play.
In short, the numpy.ones function returns a new NumPy array filled with ones.
Need of numpy.ones
Most of you came to this article to learn how to use numpy.ones in Python. In matrix operations, we often need a matrix filled with ones. To build one in Python without writing loops, the numpy library provides a dedicated function that creates such an array in a single line: numpy.ones().
Now that you know why numpy.ones() is useful, let’s see how we can use it in our projects.
Syntax of np.ones
numpy.ones(shape, dtype=None, order='C', like=None)
Parameters
| Parameter | Mandatory/Optional | Description |
| --- | --- | --- |
| shape | mandatory | The shape of the new array, e.g. (3, 2) |
| dtype | optional | The data type of the array; float64 by default |
| order | optional | Whether to store the multi-dimensional array in row-major (C-style) or column-major (Fortran-style) order in memory. The default is 'C' (row-major). |
| like | optional | Experimental feature pending acceptance of NEP 35. |
Return
out: ndarray
numpy.ones() function returns the array of ones with the provided shape, dtype, and order.
Let’s see the working of numpy.ones() function through some quality examples.
Example 1: Creating a One-Dimensional Array with numpy.ones
import numpy as np
one_d = np.ones(5)
print(one_d)
Output:
[1. 1. 1. 1. 1.]
Explanation:
In the above example, we created a 1-D array filled with ones. The example is pretty straightforward. First, we imported numpy as np. Then we called np.ones() with the argument 5, so we get an array of 5 elements, each equal to 1.
Note: The elements have the default data type, float64. That’s why the array prints 1. rather than 1.
Example 2: Creating a Two-Dimensional Array with np.ones
import numpy as np
two_d = np.ones((3, 5))
print(two_d)
Output:
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
Explanation:
In example 2, we created a 2-D array of ones with the numpy.ones function. It is quite similar to the first example; the only difference is the shape argument. Since we need a 2-D array of ones, we passed the tuple (3, 5) to np.ones, giving 3 rows and 5 columns.
Example 3: Creating a Three-Dimensional (3D) Array with np.ones
import numpy as np
three_d = np.ones((3, 5, 7))
print(three_d)
Output:
[[[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]]
[[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]]
[[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1.]]]
Explanation:
We can also create a 3D array of ones using the np.ones() function in Python. The steps are quite similar to the first two examples; the only difference is the shape argument, which now contains three values because we are constructing a 3D array.
Here again we did not set the dtype parameter, so by default the result contains float values.
Composing an Array with Different Data Types Using numpy.ones() method
In this section of the tutorial, we will perceive, how we can create arrays consisting of all ones having different data types. The d-type or data type of the numpy.ones method can be int, float, string, etc. So, without wasting any further time let’s directly jump to the examples.
1. Creating an Array where Data Type is integer Using np.ones()
import numpy as np
array_int = np.ones((3, 5), dtype=int)
print(array_int)
Output:
[[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]]
Explanation:
In the above example, we created an array of ones whose data type is integer. The example is pretty straightforward, but let me briefly explain it for beginners. We imported numpy as np, since the ones() function is available in the numpy library. We have already seen how numpy.ones() works; the only extra thing here is the dtype parameter, which we set to int. Hence, as you can see in the output, the elements are integers, not floats.
2. Creating an Array where Data Type is string Using numpy.ones()
import numpy as np
array_str = np.ones((3, 5), dtype=str)
print(array_str)
Output:
[['1' '1' '1' '1' '1']
['1' '1' '1' '1' '1']
['1' '1' '1' '1' '1']]
Explanation:
In this case, we generated a 2-D numpy array filled with ones where the data type of the elements is string.
This example is identical to the previous one; the only change is that we passed str instead of int as the dtype argument. Hence we get a numpy array with a string data type.
3. Composing an Array with mixed Data Type in the form of Tuples Using numpy.ones()
import numpy as np
array_tuple = np.ones((3, 5), dtype=[('x', 'int'), ('y', 'float')])
print(array_tuple)
Output:
[[(1, 1.) (1, 1.) (1, 1.) (1, 1.) (1, 1.)]
[(1, 1.) (1, 1.) (1, 1.) (1, 1.) (1, 1.)]
[(1, 1.) (1, 1.) (1, 1.) (1, 1.) (1, 1.)]]
Explanation:
We can specify the array elements as a tuple and specify their data types too.
Working with order Parameter of numpy.ones Function
import numpy as np
import time
dimension = [2000, 2000]
cstyle = np.ones(dimension, order="C")
fortranstyle = np.ones(dimension, order="F")
print("We're performing a column operation on both arrays to check the performance.")
start = time.time()
np.cumsum(cstyle, axis=1)
print("Time taken to perform column operation on C-style - ", time.time()-start)
start = time.time()
np.cumsum(fortranstyle, axis=1)
print("Time taken to perform column operation on Fortran-style - ", time.time()-start)
print("\nWe're performing a row operation on both arrays to check the performance.")
start = time.time()
np.cumsum(cstyle, axis=0)
print("Time taken to perform row operation on C-style - ", time.time()-start)
start = time.time()
np.cumsum(fortranstyle, axis=0)
print("Time taken to perform row operation on Fortran-style - ", time.time()-start)
Output:
We're performing a column operation on both arrays to check the performance.
Time taken to perform column operation on C-style - 0.07280611991882324
Time taken to perform column operation on Fortran-style - 0.028921842575073242
We're performing a row operation on both arrays to check the performance.
Time taken to perform row operation on C-style - 0.026929855346679688
Time taken to perform row operation on Fortran-style - 0.0718083381652832
Explanation:
Since the elements of the C-ordered and Fortran-ordered arrays are identical, we need to time some operations to see the difference. Here we use np.cumsum() to compute cumulative sums by row and by column, run it on both array types, and compare the timings.
We first declared two numpy ones arrays with different order parameters. Then we used the time.time() function to measure each operation; time.time() - start gives the exact amount of time taken by the row/column operation in np.cumsum().
From the output, we can see that the column operations are faster on the Fortran-ordered array, while the row operations are faster on the C-ordered one. If we know ahead of time which operations we will perform, we can use the order parameter to make our calculations faster.
Numpy.ones Vs. Numpy.ones_like
| | Numpy.ones | Numpy.ones_like |
| --- | --- | --- |
| Syntax | numpy.ones(shape, dtype=None, order='C') | numpy.ones_like(a, dtype=None, order='K', subok=True, shape=None) |
| Returns | Return a new array of given shape and type, filled with ones. | Return an array of ones with the same shape and type as a given array. |
| Example | import numpy as np | import numpy as np |
| Output | [[1 1 1 1] [1 1 1 1] [1 1 1 1]] | [[0.04153551 0.65643199 0.32287169 0.60656863 0.52434272] [0.58110881 0.47309099 0.38251042 0.86157837 0.22099276] [0.60140397 0.62901982 0.51812488 0.0887905 0.78431815] [0.85056633 0.93509825 0.20018143 0.06351918 0.40621227]] [[1 1 1 1 1] [1 1 1 1 1] [1 1 1 1 1] [1 1 1 1 1]] |
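Since the Example cell in the table above is truncated, here is a small illustrative sketch contrasting the two functions; the source array is chosen arbitrarily:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # any existing array, integer dtype
same_shape_ones = np.ones_like(a)    # copies a's shape AND dtype (integer here)
explicit_ones = np.ones((2, 3))      # shape given explicitly, float64 by default

print(same_shape_ones.dtype)         # an integer dtype (platform dependent)
print(explicit_ones.dtype)           # float64
```

So numpy.ones_like is convenient when you want the new array to match an existing one, while numpy.ones is the choice when you specify the shape yourself.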
Summary:
In this article, we learned how to create 1D, 2D, and 3D numpy arrays of a given shape filled with ones. We also saw how to work with various data types in the numpy.ones function and choose the best one for our needs. After that, we looked at the underrated but useful order parameter. Finally, we covered the difference between np.ones and numpy.ones_like.
However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.
Happy Pythoning!
The post Exploring numpy.ones Function in Python | np.ones appeared first on Python Pool.