
Planet Python

Last update: November 03, 2015 10:48 AM

November 03, 2015


Peter Harkins

Have You Seen This Cache?

It looks like syntax highlighting, image thumbnails, and compiling object files. Let me explain.

 
$ time vi -i NONE -u NONE app/models/god_object.rb -c ":quit"
 
real    0m0.020s
user    0m0.010s
sys     0m0.007s

The client’s GodObject is 2,253 lines long and Vim takes 0.020 seconds to load it.

 
$ time vi -i NONE -u NONE --cmd "syn on" app/models/god_object.rb -c ":quit"
 
real    0m0.079s
user    0m0.070s
sys     0m0.007s

Syntax highlighting adds 0.059 seconds. A twentieth of a second is barely noticeable to humans. At twice the speed of the fastest blink, it feels like the smallest possible pause.

That was enough time to plant the seed of this idea.

A function is “referentially transparent” when it depends only on its arguments: any later call with the same arguments can be replaced by the value returned by the first call.

Common referentially transparent functions do things like perform arithmetic, split a string into an array, or parse the bytes of a file into a data structure representing how to color Ruby source code.

That last one is exactly the situation Vim is in: there’s some uncertainty to reading a file off disk, maybe it’s there one run and not the next, but somewhere downstream there’s a function that takes the contents of the file as its argument and returns a data structure annotating where every token starts and ends so that the frontend can highlight them in the proper colors. Any time this function is given the same bytes it generates the same data structure.

It doesn’t care what day of the week it is, how many rows are in my postgres tables, what a random number generator invents, or anything else. Stable input equals stable output.

This is very similar to a key -> value dictionary. The key is the arguments to the function. The value is whatever the function returns for those arguments. Looking up the answer is the same as calculating it and, indeed, many dictionaries can be used as caches this way. For an arithmetic example in Ruby:

 
square_of = Hash.new do |hash, key|
  hash[key] = key * key
end
 
square_of[3] # => 9

When you call square_of[19] you might be running a function or you might be retrieving a cached value. It doesn’t matter unless you have a practical reason to care about the details of CPU and memory usage. This isn’t useful for a simple operation like squaring numbers, but when there are thousands of slow steps it’s quite valuable.

Every time I open god_object.rb in vim it reparses the Ruby to figure out how to highlight it. Even if the data hasn’t changed, the function runs again. It’s referentially transparent, it’s slow enough to be noticeable, so why not cache it?

Well, maintaining this kind of cache (a “read-through cache”) has a lot of busywork. Aside from the reading and writing to some data structure, there has to be an eviction policy to determine when to throw away data that’s unlikely to be requested or to free up room for new data. People get grumpy when their text editor or web browser swells to eat two gigabytes of RAM, and they don’t connect this to usage being 10 or 50% faster as the program avoids repeating work.
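Python’s standard library happens to ship this exact pattern with a bounded LRU eviction policy built in, so a minimal sketch of a read-through cache with eviction takes only a few lines (the function itself is trivial on purpose; the point is the policy):

```python
from functools import lru_cache

# maxsize caps memory use, and least-recently-used entries are evicted
# first -- the same policy a grumpy-user-friendly editor cache would need.
@lru_cache(maxsize=1024)
def square_of(n):
    return n * n

square_of(3)                        # computed on the first call
square_of(3)                        # served from the cache
print(square_of.cache_info().hits)  # -> 1
```

The caller can’t tell which path ran, which is exactly the property referential transparency buys you.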

Additionally, Vim would really like that cache to persist across program runs. Why re-parse a file that hasn’t changed because someone quit Vim for a few minutes?

This prompts a whole new round of busywork managing disk quota and, as large as hard drives are getting, you’d have increased hassles because a program wouldn’t be able to free up space until it happened to run again.

I was kicking this around in my head, and I realized I’d seen it done before.

When I browse my folders and see thumbnails for images, they’re stored in ~/.cache/thumbnails so that when I re-open the folder they appear instantly instead of taking a half-second per file.

When I build a C or C++ project, the compiler outputs a bunch of object (.o) files, one per source file. If I build the project a second time, only the source files that have changed are rebuilt (though this is based on the timestamp of the source code rather than its contents – with a whole host of predictable bugs ensuing).
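The difference between those two staleness checks is easy to sketch in Python (mtime_stale and content_key are illustrative names, not any build tool’s actual API):

```python
import hashlib
import os

def mtime_stale(source, artifact):
    # make-style check: rebuild if the source is newer than the artifact.
    # A touched-but-unchanged file triggers a spurious rebuild; clock skew
    # or a restored backup can hide a real change.
    return os.path.getmtime(source) > os.path.getmtime(artifact)

def content_key(source):
    # Content-addressed check: the cache key is a digest of the bytes,
    # so only a genuine edit produces a new key.
    with open(source, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

A content-addressed key sidesteps the whole class of timestamp bugs, at the cost of reading and hashing the file.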

In fact, Python is quite similar to Ruby and generates .pyc files to cache its compilation of source code.

Which reminds me, every time I start rails server to load up my development server for this client, Ruby has to re-parse source code like Vim. (That’s not to say they should share a cache, they build different data structures and don’t want to have to synchronize releases, but it’s the same problem again.) Wait, how many files is that each time?

 
$ bundle clean --force
$ find app lib -name "*\.rb" | wc -l
750
$ find $GEM_HOME/gems -name "*\.rb" | wc -l
10543

Oh, it’s 11,293 files. That’s going to take a little while. And Ruby’s going to do it all from scratch every time it starts, even though the parsing is a referentially transparent, temptingly cacheable function.

Over in the web world, there’s a really nice cache system called memcached that’s often used as a read-through cache. Memcached is a key -> value store. It will evict data from the cache when it needs room, generally on a “Least Recently Used” (LRU) basis, as old data is least likely to be asked for again. The usual memcached use looks like this with the dalli gem:

 
def action
  key = request.url
  page = Rails.cache.fetch key do
    # page wasn't found, so generate it
    # whatever the block returns is cached under the key,
    # and is returned for the `page` variable
  end
  render html: page, layout: false
end

Let me generalize that a little:

 
def read_or_generate *args, &blk
  key = md5sum(args.map(&:to_s).join)
  Rails.cache.fetch key, &blk
end
 
def action
  page = read_or_generate request.url do
    # generate and return page, may not be called
  end
  render html: page, layout: false
end

Squint a little and this is our pattern again: read_or_generate takes arguments and generates or retrieves the value; we don’t care which happens. (And squint a lot more for the fact that the block is unlikely to be referentially transparent; it probably queries a database but that input is stable until the cache is deliberately cleared, or “stable enough” until it expires.)

I’d like to see a filesystem-level cache like this for Vim, for Ruby, for Python, for C, for every random program that has a referentially transparent function that might as well be a cached value. It’s enough functionality that an individual program doesn’t want to take on the problem, it wants to call a cache system. (The programs that do so usually dump to files like the image thumbnails and object files, ignoring expiration: browsing my 556M thumbnail folder shows tons of images I deleted months ago; `find ~ -name "*\.o" | wc -l` turns up 1,020 object files littered through my home directory.)

The computer would run a daemon like memcached that saved keys to disk, managed expiration, and kept the cache to a particular size. Vim doesn’t have to take on the whole problem and I don’t have to run out of disk space because a program cached two gigs of data when I last ran it a year ago.

I went looking for this software and couldn’t find it. I’d love to set aside a gig or two of disk space for faster operations and to keep my directories free of .o and .pyc clutter. There’d have to be some locking (like holding file handles) so that when, say, gcc finishes compiling 30 files, it doesn’t go to link them into a binary only to find that half of them have been evicted from the cache because I was downloading podcasts at the same time.

Does this system sound useful to you?

Before you answer, I thought of something clever for a second version.

Back when Vim read god_object.rb off the filesystem, the kernel did quite a bit of clever caching to speed up reads. The short version is that the kernel caches recent file reads and writes in RAM. Rather than allocate some fixed amount of RAM for this, the kernel uses all the free RAM that programs haven’t asked for. When a program requests more RAM, the kernel shrinks the file cache and gives RAM to the program. There’s as much room for the cache as possible, and when there’s no room free everything continues to work (but slower).

This cache system I’m considering gets a nice benefit from this feature: if Vim caches the couple kilobytes of parsed Ruby code, it’ll probably be accessed via very fast RAM instead of ever having to hit the disk. The kernel has lots of very clever and reliable code for doing this responsibly; it’s a wheel that shouldn’t be reinvented.

But the clever thing is that if this cache system were in the kernel, it could use all free disk space as a cache, just as the kernel file cache uses all free RAM. There’d be no fixed-size allocation to weigh convenience against resources.

This seems like a nice big win to me. Enough of one that I’m puzzled that I haven’t seen anything like it. Maybe I’m not searching well, maybe I haven’t explored enough unix esoterica. Would anyone be able to point me to something like this?

Or be able to build it with me?

November 03, 2015 05:38 AM

November 02, 2015


A. Jesse Jiryu Davis

PyMongo 3.1 Works Around A Funny Performance Flaw In Python 2

Leaf

Bernie Hackett, Anna Herlihy, Luke Lovett, and I are pleased to announce the release of PyMongo 3.1. It adds features that conform to two new cross-language driver specs: it implements the Command Monitoring Spec to help you measure performance, and it adds a GridFSBucket class to match our new GridFS Spec.

A few of our users reported that PyMongo 3 used five or ten percent of their CPU while idle, and recorded a couple hundred context switches per second. I investigated and found a slapstick performance flaw in Python 2's condition variable that was interacting badly with my concurrency design in PyMongo 3.

A Reasonable Tradeoff?

PyMongo 3 has new server discovery and monitoring logic which requires one background thread to monitor each server the driver is connected to. These monitors wake every 10 seconds or, when PyMongo is actively searching for a server, every half-second. This architecture has big performance advantages over PyMongo 2's—it's faster at discovering servers, and more performant and responsive if you have a large replica set, or if your replica set's topology changes, or if some members are down or slow to respond. (More info here.)

So, I expected PyMongo 3 to cost a bit of idle CPU, because its threads wake every 10 seconds to check the servers; this is intended to cost a tiny bit of memory and load in exchange for big wins in performance and reliability. But our users reported, and I confirmed, that the cost was much more than I'd guessed.

It is a requirement of our Server Discovery And Monitoring Spec that a sleeping monitor can be awakened early if the driver detects a server failure. My monitors implement this using the Python standard library's Condition.wait with a timeout.

Aside from infrequent wakeups to do their appointed chores, and occasional interruptions, monitors also wake frequently to check if they should terminate. The reason for this odd design is to avoid a deadlock in the garbage collector: a PyMongo client's destructor can't take a lock, so it can't signal the monitor's condition variable. (See What To Expect When You're Expiring, or PYTHON-863.) Therefore, the only way for a dying client to terminate its background threads is to set their "stopped" flags, and let the threads see the flag the next time they wake. I erred on the side of prompt cleanup and set this frequent check interval at 100ms.

I figured that checking a flag and going back to sleep 10 times a second was cheap on modern machines. I was incorrect. Where did I go wrong?

Idling Hot

Starting in Python 3.2, the builtin C implementation of lock.acquire takes a timeout, so condition variables wait simply by calling lock.acquire; they're implemented as efficiently as I expected. In Python 3 on my system, an idle PyMongo client takes only 0.15% CPU.

But in Python 2, lock.acquire has no timeout. To wait with a timeout in Python 2, a condition variable sleeps a millisecond, tries to acquire the lock, sleeps twice as long, and tries again. This exponential backoff reaches a maximum sleep time of 50ms.
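Paraphrased in runnable form, the Python 2 scheme looks roughly like this (a reconstruction, not CPython's exact code; the sleep counter is added for illustration):

```python
import time

def wait_with_timeout(try_acquire, timeout):
    # Poll a non-blocking acquire, sleeping with exponential backoff
    # capped at 50ms, so a long timeout never oversleeps by much.
    endtime = time.time() + timeout
    delay = 0.0005            # start at half a millisecond
    sleeps = 0
    while not try_acquire():
        remaining = endtime - time.time()
        if remaining <= 0:
            return sleeps     # timed out without acquiring
        delay = min(delay * 2, remaining, 0.05)
        time.sleep(delay)
        sleeps += 1
    return sleeps
```

With a long timeout the backoff quickly settles at 50ms sleeps; with a 100ms timeout it never gets the chance, and every restart begins again at sub-millisecond naps.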

The author of this algorithm, Tim Peters, commented:

Balancing act: We can't afford a pure busy loop, so we have to sleep; but if we sleep the whole timeout time, we'll be unresponsive. The scheme here sleeps very little at first, longer as time goes on, but never longer than 20 times per second.

If the whole timeout is long, this is completely reasonable. But PyMongo calls the condition variable's "wait" method in a loop with a timeout of only 100ms, so the exponential backoff is restarted 10 times a second. Each time the exponential backoff restarts, it sets its wait time back to one millisecond. Overall, the condition variable is not waking 10 times a second, but many hundreds of times.

In Python 2.7.10 on my system, one idle PyMongo client takes a couple percent CPU to monitor one MongoDB server. On a production server with many Python processes, each monitoring a large replica set of MongoDB servers, the overhead could be significant. It would leave less headroom for traffic spikes or require bigger hardware.

The Simplest Solution That Could Possibly Work

I surprised myself with how simple the solution was: I ditched the condition variable. In the new code, Monitor threads simply sleep half a second between checks; every half second they wake, look to see if they should ping the MongoDB server, or if they should terminate, then go back to sleep. The early wake-up feature is gone now, but since the Server Discovery And Monitoring Spec prohibits monitors from checking servers more often than every half-second anyway, this is no real loss.
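The shape of the new loop is simple enough to sketch (Monitor here is illustrative, not PyMongo's actual class):

```python
import threading
import time

class Monitor(threading.Thread):
    """A plain sleep loop: no condition variable, no early wake-up.
    The dying client just sets `stopped` and the thread notices on
    its next half-second wakeup."""

    def __init__(self, check_server, interval=0.5):
        super().__init__(daemon=True)
        self.check_server = check_server
        self.stopped = False
        self.interval = interval

    def run(self):
        while not self.stopped:
            self.check_server()      # ping, or decide whether to ping
            time.sleep(self.interval)
```

Setting a flag is safe from a destructor in a way that signaling a condition variable is not, which is what made this design viable in the first place.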

Even better, I deleted 100 lines of Python and added only 20.

The original bug-reporter Daniel Brandt wrote "results are looking very good." Nicola Iarocci, a MongoDB Master, chimed in: "Hello just wanted to confirm that I was also witnessing huge performance issues when running the Eve test suite under Python 2. With PyMongo 3.1rc0 however, everything is back to normal. Cheers!"


Links to more info about the PyMongo 3.1 release:

Image: Macpedia.

November 02, 2015 11:14 PM


Carl Chenet

My Free activities in October 2015

Personal projects:

  • db2twitter 0.1 released – build and send tweets with values from a database – first public release – official documentation
  • retweet 0.4 released – store already-retweeted tweet ids in a sqlite3 database – official documentation
  • Debian: update as NMU python-tweepy to last upstream version as a dedicated branch on the python-retweet

November 02, 2015 09:45 PM


Greg Taylor

drone-hipchat released. A HipChat plugin for Drone CI.

Since the Drone CI Plugin Marketplace didn't have one yet, I put together a quick plugin. It's written in Python instead of Go, so it won't ever be in the official plugin namespace, but it also requires substantially less boilerplate than the Go plugins. So we'll run with it because it's simple!

If this interests you, check out the Github repo and the documentation. You should be able to copy/paste that sample YAML and substitute your values. Since all Drone CI plugins are Docker containers, you'll get the benefit of automatic updates if/when I make improvements or fixes in the future.

I'm all ears for feedback, which you are encouraged to send to the issue tracker.

November 02, 2015 08:37 PM


Python Software Foundation

Register Now for PyCon 2016!

Once again, the PSF is proud to underwrite and produce the largest gathering of the international Python community at PyCon 2016!

The 2016 conference will be held in Portland, Oregon, and will take place from May 28th to June 5th -- a little later in the spring than previous PyCons, in order to accommodate the school year for our many attendees who are educators, parents, and/or students.

Those of you who have attended previous PyCons know what a fantastic event these are. Education, advocacy, community building. . . all take place at a PyCon. If you've never been, you can check out these talks from last year's PyCon 2015 in Montreal.

But nothing can fully convey the experience—the excitement and flavor, the connections forged and strengthened, the sheer intensity of spending several days with a large community of bright, energetic, and engaged Pythonistas sharing their knowledge and skills and teaching and learning with each other—like attending a PyCon itself.

The conference schedule will begin on the weekend with tutorials, then there will be five full tracks of talks, over 100 total, during the three main conference days. As usual, development sprints will follow, offering a unique opportunity for developers to work in "dream teams" on open source projects. And of course there will be the Summits, Expo Hall, Poster Session, Sponsor Workshops, Lightning Talks, Open Spaces, Job Fair, PyLadies Auction, and last, but hardly least, the dynamic and inviting "Hallway Track," that make for such a vibrant conference. All of this, along with ample (organized, spontaneous, and even some chaotic) social and cultural activities (including the annual Opening Reception and 5K Charity Run). The venue will be the centrally-located Convention Center which will allow for easy exploration of the fabulous city of Portland, Oregon.


By Another Believer (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

As this year's PyCon organizer, Brandon Rhodes, tells us on the PyCon blog,

PyCon offers tremendous value for both individuals and businesses. PyCon’s three main conference days offer keynote speeches, nearly a hundred talks, Open Space rooms for meetings and workshops, and an Expo Hall where you can meet dozens of sponsor companies and open source non-profits. More than 3,000 fans and contributors to Python are expected to attend the conference!

Another feature of PyCons, as opposed to other tech conferences, that must be mentioned is the diversity of speakers and attendees. For both 2014 and 2015 in Montreal, a full 33% of talks were given by women. Not only does this make for a more varied range of content and a higher degree of excellence (since the work of women programmers contributes to a greater pool of proposals from which the final talks are selected), but it also makes for a truly welcoming community. As someone who has attended the last four PyCons (the first of which before I even became a Python user), I can tell you with absolute certainty that if you come, you will not be disappointed!

And, please, if you're working on something interesting, or care to share some insights, experiences, project development, or theoretical observations, consider proposing a talk, tutorial, or a poster session.

Registration is now open, and, if you hurry, you can qualify for the reduced cost of an Early Bird ticket. If the past is any indication, these tickets, and all remaining ones, will sell out quickly. Don't be left out! Register today!

You can also view the announcement on the PyCon Blog, or go directly to Registration and Financial Aid.

I would love to hear from readers. Please send feedback, comments, or blog ideas to me at msushi@gnosis.cx.



November 02, 2015 08:24 PM


Dataquest

Data Scientist Interview: Benjamin Root

Overview

November 02, 2015 07:00 PM


Tryton News

New Tryton release 3.8

We are proud to announce the 3.8 release of Tryton.

For the first time the release contains sao, the new web client of Tryton. It is the result of the Indiegogo campaign. It is developed mainly with jQuery and Bootstrap, and its design is responsive. It requires a recent HTML5-compatible browser. Its usage doesn't require any modification on the server side: every module works out of the box with sao, just as it does with the GTK client. A demo is available at http://demo.tryton.org using demo/demo as login/password. This brings the number of supported Tryton clients to three.

A lot of work has been done to improve the accessibility of the GTK and web clients. For the GTK client, we followed the GNOME Accessibility Developers Guide as much as possible, and for the web client, we followed the Web Accessibility Initiative of the W3C. You can follow further progress on this topic on issue3459.

And of course, this release contains many bug fixes and performance improvements.

As usual, migration from previous series is fully supported.

The following screenshots are based on sao but the same feature exists also on the GTK client.

Here is a comparison of the rendering of sao versus tryton:

Sao sale Tryton sale

Major changes for the user

  • The client is now able to generate meaningful error messages for all kinds of validation. These error messages use the same syntax as the search filter.

    Error message
  • For better accessibility, the custom background color on widgets is replaced by a 'bold' label for required fields and an 'italic' label for editable fields. In the same spirit, the color of rows has been removed and can be replaced by icons.

    Label bold and italic
  • A new option for fast tabbing has been added to the client. If activated, it skips read-only fields when navigating with the tab key. This was the previous default behaviour, which needed to become optional to allow users with disabilities to navigate to read-only fields for reading.

  • The export feature now works only on selected records but it can export a tree structure.

Accounting

  • A new report showing the cash journal amounts over a period is added. This is useful when checking a cashier at closing.

  • The French accounting generates the FEC (Fichier des Écritures Comptables).

  • The wizard that generates payments allows setting a date other than the default, which is today.

  • The default revenue and expense accounts can be configured from accounting configuration.

    Account configuration
  • The date of statements can be corrected after they are posted.

Party

  • The language of the party now depends on the company.

  • An extensible list of identifiers replaces the VAT field.

    Party identifiers

Project

The computation of the project tree has been hugely improved by grouping the computations and using better queries.

  • There is now a progress field on projects and tasks and, of course, a total which is the sum of the children's.

    Project progress
  • A new method to generate invoices from projects has been added, based on the progress field.

  • It is now possible to link purchase lines to a project which will be added to the cost field.

  • Time sheet works now have a total duration field, which computes the duration of the work and its children.

    Hours per work

Sale

  • The delivery date on sale line shows the effective date once the goods are delivered.

  • It is now possible to deliver the sale to another party from the one on the invoice. This is the other side of the drop shipment which makes Tryton fully support drop shipments.

    Sale shipment party
  • The drop shipment now uses two distinct moves using a temporary drop location.

Purchase

  • The delivery date on purchase line shows the effective date once the goods are received.
  • Stock moves can be cancelled from the purchase view without having to create a supplier shipment.

Stock

  • It is possible to ask Tryton to recompute the average cost price of a product by replaying all the moves since the beginning.

  • It is possible to configure another picking location different from the storage location for the warehouses.

    Warehouse picking location
  • It is possible to set an internal provisioning per location which is used for internal order point by default for all products.

Landed Cost

These new modules allow recording landed costs on supplier shipments after their reception. A new document is created to link supplier invoice lines with shipments and to define which method to use for cost allocation. For now, two methods are available: By Value and By Weight. And thanks to the Update Cost Price wizard, the cost price of the products can be recomputed taking the landed cost into account.

Landed cost

Customs

A new module allows defining the Tariff Code from the Harmonized System and its duty rate on products. The duty rate is stored per country over a period, and two computation types are available: a fixed amount or an amount per quantity.

Sale Complaint

This new module is for managing customer complaints about sales or invoices. Actions can be defined to resolve the complaints, like returning the sale or crediting the invoice. A workflow for approval of the complaint actions is set up using the access rights.

Sale Promotion

It is now possible to apply formula-based promotions on sales selected by criteria. The promotion changes the unit price of the line when the sale goes into quotation (and it is restored if the sale goes back to draft), but only if the promotion is in favor of the customer. The available criteria are: the price list, a period, the quantity, and the products.

Sale Stock Quantity

This new module checks, at the quotation of the sale, whether there is enough product quantity in the warehouse. It also checks that the new sale will not starve older sales that will be shipped later.

Major changes for the developer

  • The progress bar widget works with floats between 0 and 1 to ease usage as a percentage.
  • The rich text widget now uses a subset of HTML to allow its implementation in sao.
  • The Many2One has a new option target_search which defines the kind of query to use for dereferenced search. The options are subquery and the new join (which is the default). The join method generates a faster query in most cases.
  • The SQL constraints use a similar syntax to python-sql. This gives more flexibility to implement backends for other databases.
  • Trying to create/write/delete on a Model based on a table_query raises an exception instead of failing silently.
  • The table name of a ModelSQL can be overridden with a configuration file. This allows working around database limitations on the length of table names.
  • The new StateReport has been added to wizards, to simplify the code of wizards that run a report.
  • The style on reports has been removed; experience showed that this feature was not used.
  • The PostgreSQL backend now manages schemas. This allows different instances of Tryton to share the same database.
  • The generic foreign key to the create/write user on all ModelSQL has been replaced by a rule that prevents deleting users. This greatly improves scalability in some circumstances.
  • The Property field now supports float and integer values.
  • A subdirectory locale/override is supported for modules that override translations of other modules.

Accounting

  • The charts of account are no longer translatable. Instead we provide translated charts via a template using XSLT.
  • The invoice doesn't set a unit price on the line. For this feature the purchase or sale module must be used.
  • Some fields of the invoice like the note and the origin are editable after posting the invoice.

Product

  • Conversion between units no longer fails silently; an explicit error is raised.
  • The volume property has been added to the products.

Project

  • The tree structures of the project and the time sheet have been separated; each object has its own.
  • The price list uses the same decimal precision as the product.
  • The cost price of the employee is stored on the time sheet line for the date of the line. This allows summing time sheet costs faster.

Purchase

  • The state of the purchase request is now searchable.
  • The purchase requests are generated even if the rounded quantity is zero to allow the user to still decide to purchase more.

Stock

  • Many unnecessary restrictions on editing move fields have been removed.
  • The expected quantity of the inventory lines is always computed even if they are added manually.
  • It is possible to create staging and draft moves using view locations. Those locations will have to be changed to really do the move.
  • The inventory uses the grouping feature to create the moves. This allows to easily support the lot (or any other extra field).

November 02, 2015 06:00 PM


Ralph Bean

Upcoming Python3 Porting vFAD

All thanks to Abdel Martínez and Matej Stuchlik, we're going to be holding a (virtual) international "Fedora Activity Day" for Python 3 porting, and it is going to be amazing. Save the date -- November 14th and 15th.

https://badges.fedoraproject.org/pngs/parselmouth.png

Things to consider:

  • If you haven't heard, 2016 is going to be the year of Python3 on the desktop, so...
  • If you don't know what you're doing with Python3 porting, don't sweat it. If you want to learn, come join and we'll try to teach you along the way.
  • If you don't know how to submit patches upstream, don't sweat it. If you want to learn, come join and we'll try to teach you along the way.
  • If you want to hack with us, add your info to the wiki page. We'll be hanging out in an opentokrtc channel and in #fedora-python on freenode. See the details.
  • We have a really cool webapp that Petr Viktorin put together. It tracks the status of different packages in Fedora and upstream so we can coordinate more effectively about what needs to be done.
  • If you want to get people in your city together, that can make it more fun. You can join the video chat as a group! The EMEA crew will be online from the Pycon CZ 2015 sprints (cool). There are a couple people from my local Python User Group that want to join in, although we're still searching for a reasonable place to meet up. I plan to be around starting at 18:00 UTC both days, although I bet the EMEA crew will be online much earlier.

Happy Hacking

November 02, 2015 02:50 PM


PyCharm

Announcing General Availability of PyCharm 5

Hurray, we’ve just released PyCharm 5, the new major version of our intelligent IDE for Python, web and scientific development! It is one of the updates of our desktop products that comprise the brand new JetBrains Toolbox.


Download PyCharm 5 for your platform today!

PyCharm 5 is available as a full-fledged Professional Edition for Python and Web development, or as a free and open-source Community Edition for pure Python and scientific development.

PyCharm 5 brings an outstanding lineup of new features, including full Python 3.5 support, Docker integration, Thread Concurrency Visualization, code insight for Django ORM methods, Conda integration, and IPython Notebook v4 support, just to name a few.

The highlights of this version include:

Python-related improvements:

IDE enhancements:

PyCharm 5 Professional Edition also brings a lot of web development enhancements including:

Please see What’s New in PyCharm 5 for more details, or watch this short What’s New in PyCharm 5 video:

Download PyCharm 5 for your platform here!

PyCharm 5 Professional Edition is a free update for you if you purchased your license after November 2, 2014. As usual, a 30-day trial is available if you want to try PyCharm Professional Edition as your new Python IDE.

Develop with pleasure!
JetBrains PyCharm Team

November 02, 2015 02:45 PM


Mike Driscoll

PyDev of the Week: Craig Bruce

This week we welcome Craig Bruce (@craigbruce) as our PyDev of the Week. Let’s see what he had to say!

Can you tell us a little about yourself (hobbies, education, etc):

My background is in computational chemistry and cheminformatics. What that really means is that I’m trained to work in the small, and mainly unheard-of, field of early-stage drug discovery, where computers are used to help guide drug design. This work often happens ten years before a drug makes it to market.

In recent years Python has become my programming language of choice, from scripting to web stacks. The scientific Python stack has grown tremendously, making it easier to focus on your specific research. I have little in the way of formal IT/CS education, but I’ve picked up a lot through my education and previous roles, both in programming and system administration.

My hobbies include hiking, skiing and walking my dog. I live in New Mexico, where we enjoy the amazing outdoors.


Why did you start using Python?

It was a requirement for a new job, primarily to run and develop a Django powered website. In addition to the website the scientific tasks I needed to carry out used a C++ toolkit which had Python bindings, so Python became very useful for every programming aspect of this role.

What other programming languages do you know and which is your favorite?

Perl, Java and JavaScript but I’d much rather write in Python and Bash as the combination of the two allow me to deploy and run most things I need.

What projects are you working on now?

I’m working on a cloud-based product which is a drug discovery platform for use by pharmaceutical companies. It is written in Python and has a Django web app and API. My work is predominately in the backend and DevOps aspect.

Outside of work I’m one of the three co-founders of Django Events Foundation North America (DEFNA), which runs DjangoCon US, so my role as Treasurer keeps me busy. DjangoCon US was earlier this month and I have been delighted by the positive feedback we have received. Preparations for 2016 have already begun – be sure to follow us on @djangocon for updates!

Which Python libraries are your favorite (core or 3rd party)?

Django continues to be a favorite because it is so versatile and mature. If I need something else I can almost guarantee that there is a 3rd party app to help as well. My current project heavily utilizes AWS, so Troposphere (https://github.com/cloudtools/troposphere) is invaluable in making CloudFormation templates easy (e.g. not writing any JSON, a pet peeve of mine).

Is there anything else you’d like to say?

Thanks for the invitation to participate in this series!

Thank you!

November 02, 2015 01:30 PM


Yasoob Khalid

Looking for guest bloggers

Hi there folks! I am very busy nowadays, as you might already have noticed from the long pauses between posts. Therefore, I am searching for guest bloggers who would like to write about Python, its frameworks, or literally anything interesting and informative related to Python.

If you believe that you can do this then kindly drop me an email or post a comment below. I would love to hear from you. :)

Cheers!


November 02, 2015 12:33 PM


Ruslan Spivak

Let’s Build A Simple Interpreter. Part 6.

Today is the day :) “Why?” you might ask. The reason is that today we’re wrapping up our discussion of arithmetic expressions (well, almost) by adding parenthesized expressions to our grammar and implementing an interpreter that will be able to evaluate parenthesized expressions with arbitrarily deep nesting, like the expression 7 + 3 * (10 / (12 / (3 + 1) - 1)).

Let’s get started, shall we?

First, let’s modify the grammar to support expressions inside parentheses. As you remember from Part 5, the factor rule is used for basic units in expressions. In that article, the only basic unit we had was an integer. Today we’re adding another basic unit - a parenthesized expression. Let’s do it.

Here is our updated grammar:
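In text form, the updated rules are (they also appear verbatim in the parser's docstring further down):

```
expr   : term ((PLUS | MINUS) term)*
term   : factor ((MUL | DIV) factor)*
factor : INTEGER | LPAREN expr RPAREN
```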

The expr and the term productions are exactly the same as in Part 5 and the only change is in the factor production, where the terminal LPAREN represents a left parenthesis '(', the terminal RPAREN represents a right parenthesis ')', and the non-terminal expr between the parentheses refers to the expr rule.

Here is the updated syntax diagram for the factor, which now includes alternatives:

Because the grammar rules for the expr and the term haven’t changed, their syntax diagrams look the same as in Part 5:

Here is an interesting feature of our new grammar - it is recursive. If you try to derive the expression 2 * (7 + 3), you will start with the expr start symbol and eventually you will get to a point where you will recursively use the expr rule again to derive the (7 + 3) portion of the original arithmetic expression.

Let’s decompose the expression 2 * (7 + 3) according to the grammar and see how it looks:
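One way to sketch that decomposition as a tree (my own rendering, following the grammar):

```
expr
  term
    factor -> INTEGER 2
    MUL    -> *
    factor -> LPAREN expr RPAREN
                     expr
                       term -> factor -> INTEGER 7
                       PLUS -> +
                       term -> factor -> INTEGER 3
```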

A little aside: if you need a refresher on recursion, take a look at Daniel P. Friedman and Matthias Felleisen’s The Little Schemer book - it’s really good.

Okay, let’s get moving and translate our new updated grammar to code.

The following are the main changes to the code from the previous article:

  1. The Lexer has been modified to return two more tokens: LPAREN for a left parenthesis and RPAREN for a right parenthesis.
  2. The Interpreter‘s factor method has been slightly updated to parse parenthesized expressions in addition to integers.

Here is the complete code of a calculator that can evaluate arithmetic expressions containing integers; any number of addition, subtraction, multiplication and division operators; and parenthesized expressions with arbitrarily deep nesting:

# Token types
#
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, PLUS, MINUS, MUL, DIV, LPAREN, RPAREN, EOF = (
    'INTEGER', 'PLUS', 'MINUS', 'MUL', 'DIV', '(', ')', 'EOF'
)


class Token(object):
    def __init__(self, type, value):
        self.type = type
        self.value = value

    def __str__(self):
        """String representation of the class instance.

        Examples:
            Token(INTEGER, 3)
            Token(PLUS, '+')
            Token(MUL, '*')
        """
        return 'Token({type}, {value})'.format(
            type=self.type,
            value=repr(self.value)
        )

    def __repr__(self):
        return self.__str__()


class Lexer(object):
    def __init__(self, text):
        # client string input, e.g. "4 + 2 * 3 - 6 / 2"
        self.text = text
        # self.pos is an index into self.text
        self.pos = 0
        self.current_char = self.text[self.pos]

    def error(self):
        raise Exception('Invalid character')

    def advance(self):
        """Advance the `pos` pointer and set the `current_char` variable."""
        self.pos += 1
        if self.pos > len(self.text) - 1:
            self.current_char = None  # Indicates end of input
        else:
            self.current_char = self.text[self.pos]

    def skip_whitespace(self):
        while self.current_char is not None and self.current_char.isspace():
            self.advance()

    def integer(self):
        """Return a (multidigit) integer consumed from the input."""
        result = ''
        while self.current_char is not None and self.current_char.isdigit():
            result += self.current_char
            self.advance()
        return int(result)

    def get_next_token(self):
        """Lexical analyzer (also known as scanner or tokenizer)

        This method is responsible for breaking a sentence
        apart into tokens. One token at a time.
        """
        while self.current_char is not None:

            if self.current_char.isspace():
                self.skip_whitespace()
                continue

            if self.current_char.isdigit():
                return Token(INTEGER, self.integer())

            if self.current_char == '+':
                self.advance()
                return Token(PLUS, '+')

            if self.current_char == '-':
                self.advance()
                return Token(MINUS, '-')

            if self.current_char == '*':
                self.advance()
                return Token(MUL, '*')

            if self.current_char == '/':
                self.advance()
                return Token(DIV, '/')

            if self.current_char == '(':
                self.advance()
                return Token(LPAREN, '(')

            if self.current_char == ')':
                self.advance()
                return Token(RPAREN, ')')

            self.error()

        return Token(EOF, None)


class Interpreter(object):
    def __init__(self, lexer):
        self.lexer = lexer
        # set current token to the first token taken from the input
        self.current_token = self.lexer.get_next_token()

    def error(self):
        raise Exception('Invalid syntax')

    def eat(self, token_type):
        # compare the current token type with the passed token
        # type and if they match then "eat" the current token
        # and assign the next token to the self.current_token,
        # otherwise raise an exception.
        if self.current_token.type == token_type:
            self.current_token = self.lexer.get_next_token()
        else:
            self.error()

    def factor(self):
        """factor : INTEGER | LPAREN expr RPAREN"""
        token = self.current_token
        if token.type == INTEGER:
            self.eat(INTEGER)
            return token.value
        elif token.type == LPAREN:
            self.eat(LPAREN)
            result = self.expr()
            self.eat(RPAREN)
            return result

    def term(self):
        """term : factor ((MUL | DIV) factor)*"""
        result = self.factor()

        while self.current_token.type in (MUL, DIV):
            token = self.current_token
            if token.type == MUL:
                self.eat(MUL)
                result = result * self.factor()
            elif token.type == DIV:
                self.eat(DIV)
                result = result / self.factor()

        return result

    def expr(self):
        """Arithmetic expression parser / interpreter.

        calc> 7 + 3 * (10 / (12 / (3 + 1) - 1))
        22

        expr   : term ((PLUS | MINUS) term)*
        term   : factor ((MUL | DIV) factor)*
        factor : INTEGER | LPAREN expr RPAREN
        """
        result = self.term()

        while self.current_token.type in (PLUS, MINUS):
            token = self.current_token
            if token.type == PLUS:
                self.eat(PLUS)
                result = result + self.term()
            elif token.type == MINUS:
                self.eat(MINUS)
                result = result - self.term()

        return result


def main():
    while True:
        try:
            # To run under Python3 replace 'raw_input' call
            # with 'input'
            text = raw_input('calc> ')
        except EOFError:
            break
        if not text:
            continue
        lexer = Lexer(text)
        interpreter = Interpreter(lexer)
        result = interpreter.expr()
        print(result)


if __name__ == '__main__':
    main()

Save the above code into the calc6.py file, try it out and see for yourself that your new interpreter properly evaluates arithmetic expressions that have different operators and parentheses.

Here is a sample session:

$ python calc6.py
calc> 3
3
calc> 2 + 7 * 4
30
calc> 7 - 8 / 4
5
calc> 14 + 2 * 3 - 6 / 2
17
calc> 7 + 3 * (10 / (12 / (3 + 1) - 1))
22
calc> 7 + 3 * (10 / (12 / (3 + 1) - 1)) / (2 + 3) - 5 - 3 + (8)
10
calc> 7 + (((3 + 2)))
12


And here is a new exercise for you for today:


Hey, you read all the way to the end! Congratulations, you’ve just learned how to create (and if you’ve done the exercise - you’ve actually written) a basic recursive-descent parser / interpreter that can evaluate pretty complex arithmetic expressions.

In the next article I will talk in a lot more detail about recursive-descent parsers. I will also introduce an important and widely used data structure in interpreter and compiler construction that we’ll use throughout the series.

Stay tuned and see you soon. Until then, keep working on your interpreter and most importantly: have fun and enjoy the process!


Here is a list of books I recommend that will help you in your study of interpreters and compilers:

  1. Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)

  2. Writing Compilers and Interpreters: A Software Engineering Approach

  3. Modern Compiler Implementation in Java

  4. Modern Compiler Design

  5. Compilers: Principles, Techniques, and Tools (2nd Edition)


By the way, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch. You can get a feel for the book here, here, and here. Subscribe to the mailing list to get the latest updates about the book and the release date.

OPTIN_FORM_PLACEHOLDER


All articles in this series:

November 02, 2015 10:44 AM

November 01, 2015


David MacIver

Let Hypothesis make your choices for you

I had a moment of weakness this morning and did some feature development on Hypothesis despite promising not to. The result is Hypothesis 1.14.0.

This adds a bunch of interesting new strategies to the list. One I’d like to talk about in particular is the new choices strategy.

What does it do? Well, it gives you something that behaves like random.choice, only under Hypothesis’s control and subject to minimization. This more or less solves the problem I had a long and complicated post about a while ago: picking elements from a list. You can now do something like:

from hypothesis import given, strategies as st
 
 
@given(st.lists(st.integers(), min_size=1), st.choices())
def test_deletion(values, choice):
    v = choice(values)
    values.remove(v)
    assert v not in values

Then running this will print something like:

_____________________________________________ test_deletion ______________________________________________
test_del.py:4: in test_deletion
    def test_deletion(values, choice):
src/hypothesis/core.py:583: in wrapped_test
    print_example=True, is_final=True
src/hypothesis/executors/executors.py:25: in default_executor
    return function()
src/hypothesis/core.py:365: in run
    return test(*args, **kwargs)
test_del.py:7: in test_deletion
    assert v not in values
E   assert 0 not in [0]
----------------------------------------------- Hypothesis -----------------------------------------------
Falsifying example: test_deletion(values=[0, 0], choice=choice)
Choice #1: 0
===================

Note that the choices are printed as they are made. This was one of the major obstacles to implementing something like this in the past: the lack of the ability to display the results from within the test. The new note API offers a solution to this.

November 01, 2015 03:49 PM


Jorgen Schäfer

Elpy 1.10.0 released

I just released version 1.10.0 of Elpy, the Emacs Python Development Environment. This is a feature release.

Elpy is an Emacs package to bring powerful Python editing to Emacs. It combines a number of other packages, both written in Emacs Lisp as well as Python.

Quick Installation

Evaluate this:

(require 'package)
(add-to-list 'package-archives
'("elpy" .
"https://jorgenschaefer.github.io/packages/"))

Then run M-x package-install RET elpy RET.

Finally, run the following (and add them to your .emacs):

(package-initialize)
(elpy-enable)

Changes in 1.10.0

Thanks to ChillarAnand and Georg Brandl for their contributions!

November 01, 2015 01:06 PM


Vasudev Ram

data_dump, a Python tool like Unix od (octal dump)

By Vasudev Ram




The Unix od command, which stands for octal dump, should be known to regular Unix users. Though the name includes the word octal (for historical reasons) [1], it supports other numeric systems as well; see below.

[1] See:

The Wikipedia page for od, which says that "od is one of the earliest Unix programs, having appeared in version 1 AT&T Unix."

od is a handy tool. It dumps the contents of a file (or standard input) to standard output, in "unambiguous" ways, such as the ability to show the file contents as numeric values (ASCII codes), interpreted as bytes / two-byte words / etc. It can do this in octal, decimal, binary or hexadecimal format. It can also show the content as characters. But the Unix cat command does that already, so the od command is more often used to show characters along with their numeric codes. It also shows the byte offset (from the start of the file) of every, say, 10th character in the file, in the left column of its output, so the user can keep track of where any content occurs in the file.

All this is useful because it allows Unix users (programmers and system administrators as well as end users) to inspect the contents of files in different ways (hex, binary, character, etc.). The files thus inspected could be text files or binary files of any kind. Often, programmers use the output of od to debug their application, by viewing a file that their program is either reading from or writing to, to verify that it contains what they expect, or to find that it contains something that they do not expect - which could be due either to invalid input or to a bug in their program causing incorrect output.
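For example, a typical invocation (my example, not from the article) dumps a file as single hex bytes with decimal offsets:

```shell
# Dump a small file as single hex bytes (-t x1) with decimal
# offsets (-A d); the filename and contents here are made up.
printf 'Hi\n' > sample.txt
od -A d -t x1 sample.txt
# prints something like:
#   0000000 48 69 0a
#   0000003
```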

I needed to use od recently. Doing so made me think of writing a simple version of it in Python, for fun and practice. So I did it. I named it data_dump.py. Here is the code for it:

'''
Program name: data_dump.py
Author: Vasudev Ram.
Copyright 2015 Vasudev Ram.
Purpose: To dump the contents of a specified file or standard input,
to the standard output, in one or more formats, such as:
- as characters
- as decimal numbers
- as hexadecimal numbers
- as octal numbers

Inspired by the od (octal dump) command of Unix, and intended to work,
very roughly, like it. Will not attempt to replicate od exactly or even
closely. May diverge from od's way of doing things, as desired.
'''

# Imports:

from __future__ import print_function
import sys

# Global constants:

# Maximum number of characters (from the input) to output per line.
MAX_CHARS_PER_LINE = 16

# Global variables:

# Functions:

def data_dump(infil, line_len=MAX_CHARS_PER_LINE, options=None):
    '''
    Dumps the data from the input source infil to the standard output.
    '''
    byte_addr = 0
    buf = infil.read(line_len)
    # While not EOF.
    while buf != '':
        # Print the offset of the first character to be output on this line.
        # The offset refers to the offset of that character in the input,
        # not in the output. The offset is 0-based.
        sys.stdout.write("{:>08s}: ".format(str(byte_addr)))

        # Print buf in character form, with . for control characters.
        # TODO: Change to use \n for line feed, \t for tab, etc., for
        # those control characters which have unambiguous C escape
        # sequences.
        byte_addr += len(buf)
        for c in buf:
            sys.stdout.write(' ')  # Left padding before c as char.
            if (0 <= ord(c) <= 31) or (ord(c) == 127):
                sys.stdout.write('.')
            else:
                sys.stdout.write(c)
        sys.stdout.write('\n')

        # Now print buf in hex form.
        sys.stdout.write(' ' * 10)  # Padding to match that of byte_addr above.
        for c in buf:
            sys.stdout.write(' ')  # Left padding before c in hex.
            sys.stdout.write('{:>02s}'.format(hex(ord(c))[2:].upper()))
        sys.stdout.write('\n')
        buf = infil.read(line_len)
    infil.close()


def main():
    '''
    Checks the arguments, sets option flags, sets input source.
    Then calls data_dump() function with the input source and options.
    '''
    try:
        lsa = len(sys.argv)
        if lsa == 1:
            # Input from standard input.
            infil = sys.stdin
        elif lsa == 2:
            # Input from a file.
            infil = open(sys.argv[1], "rb")
        data_dump(infil)
        sys.exit(0)
    except IOError as ioe:
        print("Error: IOError: " + str(ioe))
        sys.exit(1)

if __name__ == '__main__':
    main()

And here is the output of a sample run, on a small text file:
$ data_dump.py t3
00000000: T h e q u i c k b r o w n
54 68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20
00000016: f o x j u m p e d o v e r
66 6F 78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20
00000032: t h e l a z y d o g . . . T
74 68 65 20 6C 61 7A 79 20 64 6F 67 2E 0D 0A 54
00000048: h e q u i c k b r o w n f
68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 66
00000064: o x j u m p e d o v e r t
6F 78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20 74
00000080: h e l a z y d o g . . . T h
68 65 20 6C 61 7A 79 20 64 6F 67 2E 0D 0A 54 68
00000096: e q u i c k b r o w n f o
65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 66 6F
00000112: x j u m p e d o v e r t h
78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20 74 68
00000128: e l a z y d o g .
65 20 6C 61 7A 79 20 64 6F 67 2E

$
Note that I currently replace control / non-printable characters by a dot, in the output. Another option could be to replace (at least some of) them with C escape sequences, such as \r (carriage return, ASCII 13), \n (line feed, ASCII 10), etc. That is the way the original od does it.
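A sketch of that alternative, as a hypothetical helper of my own (not part of data_dump.py): the few control characters with unambiguous C escapes map to their escape sequences, printable ASCII passes through, and everything else still becomes a dot.

```python
# Map a single character to a printable representation: tab,
# line feed and carriage return become C escape sequences,
# printable ASCII passes through, and anything else becomes
# '.' (as data_dump.py does today). Hypothetical helper only.
C_ESCAPES = {9: r'\t', 10: r'\n', 13: r'\r'}

def printable(ch):
    code = ord(ch)
    if code in C_ESCAPES:
        return C_ESCAPES[code]
    if 32 <= code <= 126:
        return ch
    return '.'
```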

In a future post, I'll make some improvements, and also show and discuss some interesting and possibly anomalous results that I got when testing data_dump.py with different inputs.

Happy dumping! :)


Details of the above image are available here:

Truck image credits

- Vasudev Ram - Online Python training and programming

Signup to hear about new products and services I create.

Posts about Python  Posts about xtopdf

My ActiveState recipes


November 01, 2015 03:10 AM

October 31, 2015


بايثون العربي

Enabling Tab Completion in the Python Shell

It's good to play around with Python and discover lots of new things that may help you in the future. One of the most important features that comes with Python is its interactive shell, where we can try out anything we want. Say you open the Python shell and assign a number of variables you want to work with; naturally you don't want to retype those variable names every time, and it isn't easy to memorize the names of all of Python's built-in functions and modules either. Since you use the same variables and functions over and over, it would be nice to type a letter or two and have the shell complete the rest of the name (the same feature found on Cisco routers) just by pressing the TAB key. Luckily, Python provides this feature, so let's see how to enable it.

Create a file named ".pythonrc" in your home directory with the following command:

$ vim ~/.pythonrc

Then add the following two lines to that file:

import rlcompleter, readline
readline.parse_and_bind("tab: complete")

Save and close the file. Now open "~/.bashrc" and add the following line at the end of the file:

export PYTHONSTARTUP="~/.pythonrc"

Save and close the file, then reload the following two files so they pick up the changes we made:

$ source ~/.profile
$ source ~/.bashrc

That's it; now it's time to reap the benefits. Open the Python shell by typing python in the terminal and assign a variable, for example:

>>> my_variable = "Hello world"
>>> my

On the second line, if you type my and press the TAB key, the shell will automatically complete the variable name to my_variable.

And if we import any Python module, we can see all of that module's functions by typing the module name followed by a dot and then pressing TAB:

>>> import os
>>> os.

Display all 234 possibilities? (y or n)

October 31, 2015 10:12 PM


Daily Tech Video (Python)

[Video 341] Ned Batchelder: Facts and Myths about Python names and values

Programmers who come to Python from other languages are often surprised to discover how variables work. Things seem even odder when you deal with mutable and immutable values, and see the differences between their behaviors. In this talk, Ned Batchelder reviews how assignment works in Python, when names are connected to values, and how Python’s consistent rules can result in some confusion — until you understand what’s going on, that is.
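A minimal illustration of the core fact (my example, not taken from the talk): assignment binds names to values and never copies.

```python
# Two names bound to one mutable list: mutation through either
# name is visible through both, because assignment never copies.
a = [1, 2]
b = a
b.append(3)
assert a == [1, 2, 3]  # a sees the change made through b

# Rebinding b to a new list does not touch a: rebinding a name
# is not the same as mutating the value it pointed to.
b = [0]
assert a == [1, 2, 3]
assert b == [0]
```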

The post [Video 341] Ned Batchelder: Facts and Myths about Python names and values appeared first on Daily Tech Video.

October 31, 2015 09:05 PM


pgcli

Release v0.20.0

Pgcli is a command line interface for Postgres that does auto-completion and syntax highlighting. You can install this version by:

$ pip install -U pgcli

Check detailed instructions if you're having difficulty.

This version adds support for the \h command and \x auto, two very important commands that bring us close to being a total replacement for psql. This version can also handle really large databases without trouble, thanks to asynchronous completion refresh. Full details below:

Features:

  • Perform auto-completion refresh in background. (Thanks: Amjith, Darik Gamble, Iryna Cherniavska).

    When the auto-completion entries are refreshed, the update now happens in a background thread. This means large databases with thousands of tables are handled without blocking.

  • Add support for \h command. (Thanks: Stuart Quin).

    This is a huge deal. Users can now get help on an SQL command by typing: \h COMMAND_NAME in the pgcli prompt.

  • Add support for \x auto. (Thanks: Stuart Quin).

    \x auto will automatically switch to expanded mode if the output is wider than the display window.

  • Don't hide functions from pg_catalog. (Thanks: Darik Gamble).

  • Suggest set-returning functions as tables. (Thanks: Darik Gamble).

    Functions that return table like results will now be suggested in places of tables.

  • Suggest fields from functions used as tables. (Thanks: Darik Gamble).

  • Using pgspecial as a separate module. (Thanks: Iryna Cherniavska).

  • Make "enter" key behave as "tab" key when the completion menu is displayed. (Thanks: Matheus Rosa).

  • Support different error-handling options when running multiple queries. (Thanks: Darik Gamble).

    When on_error = STOP in the config file, pgcli will abort execution if one of the queries results in an error.

  • Hide the password displayed in the process name in ps. (Thanks: Stuart Quin)

  • Add CONCURRENTLY to keyword completion. (Thanks: Johannes Hoff).
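The on_error setting mentioned above goes in the pgcli config file; a minimal sketch (section name assumed from the default config, so check your own file):

```
# ~/.config/pgcli/config  (or ~/.pgclirc on older installs)
[main]
on_error = STOP
```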

Bug Fixes:

  • Fix the ordering bug in \d+ display, this bug was displaying the wrong table name in the reference. (Thanks: Tamas Boros).
  • Only show expanded layout if valid list of headers provided. (Thanks: Stuart Quin).
  • Fix suggestions in compound join clauses. (Thanks: Darik Gamble).
  • Fix completion refresh in multiple query scenario. (Thanks: Darik Gamble).
  • Fix the broken timing information.
  • Fix the removal of whitespaces in the output. (Thanks: Jacek Wielemborek)
  • Fix PyPI badge. (Thanks: Artur Dryomov).

Improvements:

  • Move config file to ~/.config/pgcli/config instead of ~/.pgclirc (Thanks: inkn).
  • Move literal definitions to standalone JSON files. (Thanks: Darik Gamble).

Internal Changes:

  • Improvements to integration tests to make it more robust. (Thanks: Iryna Cherniavska).

October 31, 2015 07:00 AM


Podcast.__init__

Episode 29 - Anthony Scopatz on Xonsh

Visit our site to listen to past episodes, support the show, and sign up for our mailing list.

Summary

Anthony Scopatz is the creator of the Python shell Xonsh in addition to his work as a professor of nuclear physics. In this episode we talked to him about why he created Xonsh, how it works, and what his goals are for the project. It is definitely worth trying out Xonsh as it greatly simplifies the day-to-day use of your terminal environment by adding easily accessible python interoperability.

Brief Introduction

Hired Logo

On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job.

Linode Sponsor Banner

Use the promo code podcastinit10 to get a $10 credit when you sign up!

Interview with Anthony Scopatz

Picks

Keep In Touch

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA


October 31, 2015 02:06 AM

October 30, 2015


Tomasz Ducin

travis python and phpunit example

Travis is a cool continuous integration environment which enables you to test your projects. Many GitHub users configure their projects with Travis to provide instant testing (your results are available ~1 minute after you push something to your repo). While it's quite easy to set up some tests on a specific platform (e.g. both implementation and tests written in, say, Python), it doesn't have to be that easy when you use multiple platforms.

I was developing a Python server which I wanted to test on different platforms, one of them being PHP (with PHPUnit being the most standard testing tool). The standard Travis configuration should set the language property stating the main platform, which was python in my case (because the server is the main component):


language: python
python:
- "2.6"
- "2.7"
This gives me the standard Python tools already available (without the need to install them manually). As the Travis community states, you may configure multiple platforms for testing, but you don't have default tools set up for all of them. PHPUnit is not available here, though it would be if PHP were the main language, so I had to install it myself. Having Travis run sudo apt-get install phpunit seems the natural way to do that, but it's not the way to go, because of the well-known phpunit install issue (mismatched dependencies). As the Stack Overflow answer states, you should install PHPUnit through pear by running a sequence of several sudo-ed commands.

Following is an example of minimum configuration that finally worked for my python project:


language: python
python:
- "2.6"
- "2.7"
before_install:
- sudo apt-get update
- sudo apt-get install python-twisted php-pear
- sudo pear channel-discover pear.phpunit.de
- sudo pear channel-discover pear.symfony-project.com
- sudo pear channel-discover components.ez.no
- sudo pear update-channels
- sudo pear upgrade-all
- sudo pear install --alldeps phpunit/PHPUnit
- sudo pear install --force --alldeps phpunit/PHPUnit
install:
- pip install -r requirements.txt --use-mirrors
before_script: ./server.py 12345 &
after_script: pkill -9 -f "server.py"
script:
- cd client/php && phpunit && cd ../..
- nosetests --nocapture
PHP (the language) is installed by default, but pear (the php-pear package) is not, and neither is phpunit. This links to an example build that worked with the above configuration.

October 30, 2015 10:35 PM

python os.fork parent child PIDs

Ever wondered how to resolve a forked child process ID? The following code snippet makes things pretty clear:

The code snippet is available as a gist: https://gist.github.com/ducin/7289863

You need to know two facts:

  • os.fork() returns the PID of the newly created child process in the parent, and 0 in the child.
  • os.getpid() always returns the PID of the process that calls it.

Based on the above facts, we can analyse the output of the script:

[parent] starts PID: 17420
[parent] parent process have created child with PID: 17421
[child] child process can't use os.fork() PID, since it's 0
[child] but it can reevaluate os.getpid() to get it's own PID: 17421
The evaluation goes like this:
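Since the embedded gist is not rendered here, the following is a minimal self-contained sketch of the demo (an assumption on my part, reconstructed from the output above rather than the author's actual gist):

```python
import os

def fork_demo():
    print("[parent] starts PID: %d" % os.getpid())
    pid = os.fork()  # returns the child's PID in the parent, 0 in the child
    if pid == 0:
        # child branch: fork() gave us 0, so ask the OS for our own PID
        print("[child] child process can't use os.fork() PID, since it's 0")
        print("[child] but it can reevaluate os.getpid() to get its own PID: %d"
              % os.getpid())
        os._exit(0)  # exit here so the child doesn't run the rest of the script
    # parent branch: fork() gave us the child's PID
    print("[parent] parent process has created child with PID: %d" % pid)
    os.waitpid(pid, 0)  # reap the child to avoid leaving a zombie

if __name__ == "__main__":
    fork_demo()
```

Note that os.fork() is POSIX-only, so this runs on Linux/macOS but not on Windows.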

October 30, 2015 10:34 PM

python open interactive console

In this article I'll show a small code snippet that simulates a breakpoint without using any IDE (Integrated Development Environment). This is similar to Firebug's / Chrome Developer Tools' JavaScript console, where you may run your custom commands (typed in real time) while being enclosed in the breakpoint's scope. This is very useful when dealing with big/undocumented/legacy code and you want to check the state of variables.

All this code does is copy the local/global variables, set up console autocompletion and start the interactive shell, where you, the developer, can look at the Python runtime environment. The following code presents the console.py module with the copen method and the test.py file which demonstrates the console usage:

The code is available as a gist: https://gist.github.com/ducin/6882621

Fetch the repository and run the test.py file:

git clone git@github.com:6882621.git py_console && cd py_console && python test.py
Type dir() to check the current scope content and see example_list and example_tuple. After closing the console, the script will continue where it stopped (see the print statement):

remote: Counting objects: 14, done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 14 (delta 3), reused 8 (delta 2)
Receiving objects: 100% (14/14), done.
Resolving deltas: 100% (3/3), done.
Python 2.7.2+ (default, Jul 20 2012, 22:12:53)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> dir()
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'console', 'example_list', 'example_tuple']
>>> example_list
[1, 2, 3]
>>> example_tuple
('abc', 'def')
>>> # hit ctrl+D to quit
>>>
this will be continued
$

October 30, 2015 10:33 PM


PyCharm

PyCharm 5 RC2 is available

With just a few days before the official release of PyCharm 5, today we’ve published the PyCharm 5 RC2 build, which is already available for download and evaluation from the EAP page.

The full list of fixes and improvements for this build can be found in the release notes.

Download the PyCharm 5 RC2 build for your platform and please report any bugs and feature requests to our Issue Tracker. It will also be available shortly as a patch update from within the IDE (from previous EAP builds only) for those who selected the EAP or Beta Releases channels in the update settings.

Stay tuned for a PyCharm 5 release announcement, follow us on twitter, and develop with pleasure!

-PyCharm Team

October 30, 2015 07:32 PM


Nikola

Nikola v7.7.3 is out!

On behalf of the Nikola team, I am pleased to announce the immediate availability of Nikola v7.7.3. It fixes some bugs and adds new features.

What is Nikola?

Nikola is a static site and blog generator, written in Python. It can use Mako and Jinja2 templates, and input in many popular markup formats, such as reStructuredText and Markdown — and can even turn Jupyter (IPython) Notebooks into blog posts! It also supports image galleries, and is multilingual. Nikola is flexible, and page builds are extremely fast, courtesy of doit (which is rebuilding only what has been changed).

Find out more at the website: https://getnikola.com/

Downloads

Install using pip install Nikola or download tarballs on GitHub and PyPI.

Changes

Features

  • Recommend ghp-import2 (better packaging) (Issue #2152)
  • New TAGS_INDEX_PATH option for overriding the path of the tag index list page.
  • Support for ~~strikethrough~~ in Markdown (Issue #2149)
  • Hungarian translation (by Baptiste Darthenay)
  • The serve and auto commands publish DNS Service Discovery records to the local network, announcing that they're running web servers.

Bugfixes

  • Implement translating DATE_FORMAT properly
  • Remove superfluous translatability for a boolean (Issue #2140)
  • Pass messages to post-list template (Issue #2156)
  • Changed default log level from INFO to NOTICE (nikola check is less chatty)
  • Fix support for panorama images in gallery (Issue #2143)
  • Support "maps.world.World" and similar charts in Pygal (Issue #2142)
  • Capitalize "UTF-8" properly in locale guessing (Issue #2137)

October 30, 2015 04:09 PM


Obey the Testing Goat

[OT] Autumn Leaves

To my usual readers who come here for TDD: apologies, this is off-topic; normal service will resume shortly.

I was in an autumnal mood yesterday, and found myself listening to 3 different versions of the jazz standard "Autumn Leaves", and since it's that sort of season, I thought I'd share them with you.

I realised I was surprisingly affected by the song, because actually I've been listening to it, in different forms, for much longer than I thought. Here are the three that stand out, in reverse chronological order of my life encounters with them.

The first version is Eva Cassidy's. The fact that I'm listening to folk singers is not something I would ever have guessed when I was twenty (call it my wife's bad influence), but I think this version is incredibly beautiful, almost perfect.

(if you like the sounds of that, be sure to check out Eva Cassidy's rendition of "somewhere over the rainbow", which is absolutely heartbreaking, particularly if you know how she died)

That was the version of the song I encountered in my 30s. Here's the version from my 20s:

Quite different! The original Coldcut mix is well worth checking out too.

I was always surprised by how affecting I found this song, whether it was in my increasingly-soppy, middle-aged 30s, or in my considerably cooler 20s, and searching around yesterday I realised why.

Autumn Leaves has been around for ages, but it's actually originally a French song, the canonical rendition of which is by classic French crooner Yves Montand, and this is the one I remember from my youth.

Now we're back to my childhood in France, and apologies to non-French speakers, I'm not sure if this will connect with y'all, but I find this one almost unbearable. It's actually based on a poem by Jacques Prevert, one famous enough that we studied it in school.

In French, it's not "Autumn Leaves" though, it's "Dead Leaves" -- "Les Feuilles Mortes". Already that sets a different tone, but it's the final stanza that I think is so devastating:

Mais la vie sépare ceux qui s'aiment,

Tout doucement, sans faire de bruit.

Et la mer efface sur le sable,

Les pas des amants désunis.

Here's my rough translation:

But life separates those who love each other,

Gently, quietly,

And the sea washes away

the sandy footprints of lovers, now disunited.

From French lessons you might remember you have to make a liaison between the "des" and the "amants", making it "desamants", which anticipates the "désunis" and makes it crushingly inevitable. I think it's about the saddest thing in the world.

Enjoy!

P.S. Everything's fine between me and my wife by the way! Don't worry. Just inspired by the season...

Image credit: kat squirrelmuffins

October 30, 2015 02:29 PM