
Planet Python

Last update: March 13, 2015 09:47 PM

March 13, 2015


PyCharm

Feature Spotlight: VCS integration in PyCharm

Happy Friday everyone,

Today we’ll take a look at some of the basic VCS features in PyCharm that can help manage different version control systems.

You may already know that PyCharm integrates seamlessly with the major version control systems: Git, GitHub, Subversion, Mercurial, Perforce (available only in PyCharm Professional Edition), and CVS. Even though these systems have different models and command sets, PyCharm makes life a lot easier by taking a VCS-agnostic approach to managing them wherever possible.

So here we go:

Checking out a project from a VCS

To import a project from a version control system, click the Check out from Version Control button on the Welcome screen, or use the same VCS command from the main menu:


Version Control settings

A project’s version control settings are accessed via Settings → Version Control. You can associate any of the project folders with a repository root. These associations can be removed at any time, or you can even opt to disable the version control integration entirely:


PyCharm can handle multiple VCS repositories assigned to different folders of the project hierarchy and perform all VCS operations on them in a uniform manner.

Changes tool window and changelists

After version control is enabled for a project, you can see and manage your local changes via the Changes tool window. To quickly access the tool window, press Alt + 9 (Cmd-9 on a Mac):


All changes are organized into changelists that can be created, removed, and made active.

Quick list of VCS operations

When you need to perform a VCS operation on a currently selected file, directory, or even on the entire project, bring up the VCS operations quick-list via Alt+Back Quote (Ctrl-V on a Mac):


Show History

The history of changes is available for a set of files or directories via the VCS operations quick-list, the main menu VCS → <version control name> → Show History, or the context menu → Show History:


To see all changes for a specific code snippet, use the Show History for Selection action.

Annotations

Annotations are available from the quick-list, the main menu or the context menu. They allow you to see who changed a certain line of code and when:


When you click the annotation, you will see the detailed information about the corresponding commit.

Useful shortcuts

Commit options

When committing changes, PyCharm lets you perform a variety of operations:


Ignored files

To configure the ignored files, go to Settings → Version Control, or use the corresponding button in the Changes tool window:


The actual list of ignored files can be displayed in the Changes tool window next to the changelists by clicking the corresponding button.
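Whether you use PyCharm's ignore list or a VCS-native mechanism such as a .gitignore file for Git projects, a typical Python project excludes at least generated and per-machine files. An illustrative minimal set (adjust to your project's layout):

```
# bytecode caches
__pycache__/
*.pyc
# per-user IDE state
.idea/workspace.xml
# local virtual environment
venv/
```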

Branches

With PyCharm you can easily create, switch, merge, compare, and delete branches (available for Git and Mercurial only). To see a list of existing branches or create a new one, use the Branches action in the main menu or context menu, the VCS operations quick-list, or the widget on the right-hand side of the status bar:


For projects with multiple repositories, PyCharm performs branch operations on all of them simultaneously, so you don’t need to switch between repositories manually.

Shelves, stashes, and patches

Shelves and Stashes help when you need to put away some local changes without committing them to the repository, switch to the repository version of the files, and then come back to your changes later. The difference between them is that Shelves are handled by PyCharm itself and stored in the local file system, while Stashes are managed by the VCS and kept in the repository.

Patches allow you to save a set of changes to a file that can be transferred via email or file sharing and then applied to the code. They are helpful when you’re working remotely without having a constant connection to your VCS repository and still need to contribute:


Log

To see the entire list of commits in a repository, sorted and filtered by branch, user, date, folder, or even a phrase in description, use the Log tab in the Changes tool window. This is the easiest way to find a particular commit, or to just browse through the history:


In this blog post we touched just the tip of the VCS integration iceberg. Go ahead and try this functionality in action! Here’s a tutorial that can walk you through the VCS integration features and provide additional information. And if after that you’re still craving more details, please see our online help.

That’s it for today. See you next week!
-Dmitry

March 13, 2015 08:37 PM


Python Software Foundation

Unicef Pi4Learning

I previously posted about a wonderful education program utilizing Raspberry Pis (AstroPi). Here’s another one:
Since last May, Unicef has been using Raspberry Pis to educate Syrian children who have been displaced into Lebanon by their country’s civil war. The program, called Pi4Learning, was developed by James Cranwell-Ward, UNICEF Lebanon Innovation Lead, and Eliane Metni of the International Education Association.
With approximately 300,000 Syrian school children living as refugees in Lebanon with no educational resources, Unicef’s Cranwell-Ward sought an inexpensive, ready-to-go solution that could be deployed in refugee camp environments. Already a Raspberry Pi enthusiast, he paired the device with Alex Eames' Kickstarter-funded HDMIPi screens. Working with Eliane Metni, who had been piloting Raspberry Pis at Dhour El Shweur Public Secondary School in Lebanon, they obtained free Arabic-language curriculum from Khan Academy and began providing free classes to the Syrian children.
The Pi4L program is divided into learning tracks: Core Skills Modules for ages 6 – 12 (literacy, numeracy, and science, using Khan Academy content); Technology Applications for ages 5 – 18 (Learning to Code and Coding to Learn); and Continuing Education and Certification for Teachers. 
Each complete computer system costs around $100 and the Khan Academy content is stored and can be delivered offline. Currently approximately 30,000 refugees are using the program, and the goal is to continue to expand.
Both Cranwell-Ward and Metni are especially excited that the program teaches kids to code and to become creative participants in an increasingly technological world community. According to Cranwell-Ward, “The rate at which tech is being rolled out into our lives is phenomenal and coding - or the understanding of technology and how to manipulate it - is going to be a core component of our lives and our children’s lives moving forward…. There needs to be some basic understanding of what technology is, how it can be manipulated, how we can use it to help ourselves, and not just be a consumer or slave,” as quoted in The Guardian.
One of the students is 11-year-old Zeinab Al Jusuf. There is a video about her experiences and the Unicef project at Unicef Stories.
There is also a wealth of information online about this project, so if you’re at all interested I urge you to read more. For an excellent overview by Unicef’s Luciano Calestini, see Innovation.
I would love to hear from readers. Please send feedback, comments, or blog ideas to me at msushi@gnosis.cx.

March 13, 2015 04:02 PM

Membership Vote

This morning, PSF Director David Mertz announced on the PSF Members' mailing list the opening of a vote. For those of you who have already self-certified as voting members, or if you are already a Fellow of the Foundation, you should have received the announcement in a private email.

This is our first stab at using the voting mechanism to get a sense of the larger membership's views on an issue currently under discussion (the non-binding poll), so we urge you to take a moment and make your voice heard.

To review your eligibility to vote and to see the certification form, please see my previous blog post Enroll as Voting Member or go to the PSF website.

Here is the announcement:
Membership Vote for Pending Sponsors and Non-Binding Poll 
The candidate Sponsor Members listed below were recommended for approval by the Python Software Foundation Board of Directors. Following the ballot choices is a detailed description of the organization (the submit button is after the descriptions, so scroll down for it).
This election will close on 2015-03-26.
Sponsor Member Candidates
Bloomberg LP: yes / no / abstain
Fastly: yes / no / abstain
Infinite Code: yes / no / abstain
Non-Binding Poll on PyCon Video Sublicensing 
Purpose: The PSF Board of Directors is seeking the collective perspective of PSF Voting Members on the appropriate handling of video recording sublicensing for presentations at PyCon US. These videos are currently made freely available on Google's YouTube, and may be incorporated into other sites through YouTube's embedding features. There are no plans to change that arrangement, but a separate question has arisen that requires determining whether it would be appropriate to exercise the sublicensing rights granted to the PSF under the PyCon US speaker agreement. This part of the poll serves as a non-binding survey of PSF Voting Members, intended to help the Directors formulate a suitable policy in this area based on the way the PyCon US speaker agreement is generally perceived, rather than based solely on what it permits as a matter of law.
Background: A request has been made to the PSF to sublicense video recordings made at PyCon of speaker presentations. The license agreement signed by speakers gives the PSF the right to grant such sublicenses; however, the Board of Directors is of mixed opinion about whether we should do so. The release form (i.e. license) agreed to by speakers is at https://us.pycon.org/2015/speaking/recording/ for reference. Note that YouTube is explicitly mentioned in the release as an example of such a sublicensee, and pyvideo.org has always been given this right (although they have only exercised it thus far by embedding YouTube hosted videos, not by mirroring content, and hence are not technically a sublicensee at this point). Embedding a video does not require a sublicense, only mirroring it does.
There are two axes along which the Board is divided. On the one hand, we are not unanimous about whether we should grant a sublicense to commercial entities which may benefit financially by providing local copies of these video recordings, and may even potentially grant such local access only to subscribers in some manner. In favor of granting such access, some Directors feel that the more widespread the mirroring, the better, regardless of the commercial or non-commercial nature of the hosting (i.e. as long as the gratis access is never removed, which is not being contemplated). In opposition to granting such access, some Directors feel that for-profit sublicensees will gain unfair commercial advantage by bundling PyCon videos with other content sold for profit. Potentially the PSF may require payment, and gain revenue, for granting these sublicense rights.
On the other hand, we are also not unanimous about whether—if we do grant sublicenses—we should do so only prospectively, once we can inform speakers of our intent prior to their talks, or whether we should exercise the rights given in speaker releases even retroactively for previous PyCons. While speakers have given such rights already in a legal sense, some Directors feel they may not have fully contemplated that grant at the time, and only going forward, with more explicit information about sublicensing intents of the PSF, should sublicensing be allowed to other entities.
Sublicense entities: Only YouTube (others embedding) / As many mirrors as possible / Only non-commercial mirrors
Sublicense timeframe: Prospectively only / Including retroactively / Not applicable

Bloomberg LP
As the market data and analysis industry leader, Bloomberg LP provides a broad portfolio of innovations to our clients. Bloomberg's Open Market Data Initiative is part of our ongoing efforts to foster open solutions for the financial services industry. This includes a set of published Python modules that are freely available to our clients at http://www.bloomberglabs.com/api/libraries/. In support of promoting further Python usage within the financial services industry, we have hosted a number of free public developer-focused events to support the Python ecosystem—including the Scientific Python community. Please refer to http://go.bloomberg.com/promo/invite/bloomberg-open-source-day-scientific-python/ and https://twitter.com/Mbussonn/status/533566917727223808. By becoming a member, we wish to further increase our support of the PSF in its mission to promote, protect, and advance the Python programming language.
Fastly
Fastly provides the PSF with unlimited free CDN services, a dedicated IP block, and hosted certificates. We also provide the PSF with free Premium Support. Over the last few months, Fastly’s comped services to the PSF totalled up to ~$20,000/month. In January 2015 alone, the PSF sent 1.7 billion requests and 132 TB through Fastly.
Python is the go-to language at Fastly for building developer tools. Python allows Fastly to rapidly prototype and deploy novel protocols and services over multiple platforms, including devices like network switches, which are traditionally not programmable. Fastly relies on Python for data analysis and to dynamically reconfigure network switching and routing to steer every request to the closest available server. These tools are instrumental in helping Fastly reliably deliver more traffic in less time.
Infinite Code
Infinite Code is a software development firm with offices in Beijing, China and Kuala Lumpur, Malaysia. We are strong believers in Free/Open Source Software and the people centric principles of Agile Development. Our language of choice is Python for software development where possible. Our recent Python developments run the range from high volume, real money gaming platforms to massively parallel data gathering and transformation for large quantities of data. Our developers have been using Python since 2001.

I would love to hear from readers. Please send feedback, comments, or blog ideas to me at msushi@gnosis.cx.

March 13, 2015 03:45 PM


PyPy Development

Pydgin: Using RPython to Generate Fast Instruction-Set Simulators

Note: This is a guest blog post by Derek Lockhart and Berkin Ilbeyi from Computer Systems Laboratory of Cornell University.

In this blog post I'd like to describe some recent work on using the RPython translation toolchain to generate fast instruction set simulators. Our open-source framework, Pydgin [a], provides a domain-specific language (DSL) embedded in Python for concisely describing instruction set architectures [b] and then uses these descriptions to generate fast, JIT-enabled simulators. Pydgin will be presented at the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) and in this post we provide a preview of that work. In addition, we discuss some additional progress updates that occurred after the publishing deadline and will not appear in the final paper [1].

Our area of research expertise is computer architecture, which is perhaps an unfamiliar topic for some readers of the PyPy blog. Below we provide some brief background on hardware simulation in the field of computer architecture, as well as some context as to why instruction set simulators in particular are such an important tool.

Simulators: Designing Hardware with Software

For computer architects in both academia and industry, a key step in designing new computational hardware (e.g., CPUs, GPUs, and mobile system-on-chips) is simulation [c] of the target system. While numerous models for simulation exist, three classes are particularly important in hardware design.

Functional Level models simulate the behavior of the target system. These models are useful for creating a "golden" reference which can serve as an executable specification or alternatively as an emulation platform for software development.

Cycle Level models aim to simulate both the behavior and the approximate timing of a hardware component. These models help computer architects explore design tradeoffs and quickly determine things like how big caches should be, how many functional units are needed to meet throughput targets, and how the addition of a custom accelerator block may impact total system performance.

Register-Transfer Level (RTL) models specify the behavior, timing, and resources (e.g., registers, wires, logic gates) of a hardware component. RTL models are bit-accurate hardware specifications typically written in a hardware description language (HDL) such as Verilog or VHDL. Once verified through extensive simulation, HDL specifications can be passed into synthesis and place-and-route tools to estimate area/energy/timing or to create FPGA or ASIC prototypes.

An instruction set simulator (ISS) is a special kind of functional-level model that simulates the behavior of a processor or system-on-chip (SOC). ISSs serve an important role in hardware design because they model the instruction set architecture (ISA) interface: the contractual boundary between hardware designers and software developers. ISSs allow hardware designers to quickly experiment with adding new processor instructions while also allowing software developers to build new compilers, libraries, and applications long before physical silicon is available.

Instruction-Set Simulators Must be Fast and Productive

Instruction-set simulators are more important than ever because the ISA boundary has become increasingly fluid. While Moore's law has continued to deliver larger numbers of transistors which computer architects can use to build increasingly complex chips, limits in Dennard scaling have restricted how these transistors can be used [d]. In simpler terms, thermal constraints (and energy constraints in mobile devices) have resulted in a growing interest in pervasive specialization: using custom accelerators to more efficiently perform compute intensive tasks. This is already a reality for designers of mobile SOCs who continually add new accelerator blocks and custom processor instructions in order to achieve higher performance with less energy consumption. ISSs are indispensable tools in this SOC design process for both hardware architects building the silicon and software engineers developing the software stack on top of it.

An instruction set simulator has two primary responsibilities: 1) accurately emulating the external execution behavior of the target, and 2) providing observability by accurately reproducing the target's internal state (e.g., register values, program counter, status flags) at each time step. However, other qualities critical to an effective ISS are simulation performance and designer productivity. Simulation performance is important because shorter simulation times allow developers to more quickly execute and verify large software applications. Designer productivity is important because it allows hardware architects to easily experiment with adding new instructions and estimate their impact on application performance.

To improve simulation performance, high-performance ISSs use dynamic binary translation (DBT) as a mechanism to translate frequently visited blocks of target instructions into optimized sequences of host instructions. To improve designer productivity, many design toolchains automatically generate ISSs from an architectural description language (ADL): a special domain-specific language for succinctly specifying instruction encodings and instruction semantics of an ISA. Very few existing systems have managed to encapsulate the design complexity of DBT engines such that high-performance, DBT-accelerated ISSs could be automatically generated from ADLs [e]. Unfortunately, tools which have done so are either proprietary software or leave much to be desired in terms of performance or productivity.

Why RPython?

Our research group learned of the RPython translation toolchain through our experiences with PyPy, which we had used in conjunction with our Python hardware modeling framework to achieve significant improvements in simulation performance [2]. We realized that the RPython translation toolchain could potentially be adapted to create fast instruction set simulators since the process of interpreting executables comprised of binary instructions shared many similarities with the process of interpreting bytecodes in a dynamic-language VM. In addition, we were inspired by PyPy's meta-tracing approach to JIT-optimizing VM design which effectively separates the process of specifying a language interpreter from the optimization machinery needed to achieve good performance.

Existing ADL-driven ISS generators have tended to use domain-specific languages that require custom parsers or verbose C-based syntax that distracts from the instruction specification. Creating an embedded-ADL within Python provides several benefits over these existing approaches including a gentler learning curve for new users, access to better debugging tools, and easier maintenance and extension by avoiding a custom parser. Additionally, we have found that the ability to directly execute Pydgin ISA descriptions in a standard Python interpreter such as CPython or PyPy significantly helps debugging and testing during initial ISA exploration. Python's concise, pseudocode-like syntax also manages to map quite closely to the pseudocode specifications provided by many ISA manuals [f].

The Pydgin embedded-ADL

Defining a new ISA in the Pydgin embedded-ADL requires four primary pieces of information: the architectural state (e.g. register file, program counter, control registers), the bit encodings of each instruction, the instruction fields, and the semantic definitions for each instruction. Pydgin aims to make this process as painless as possible by providing helper classes and functions where possible.

For example, below we provide a truncated example of the ARMv5 instruction encoding table. Pydgin maintains encodings of all instructions in a centralized encodings data structure for easy maintenance and quick lookup. The user-provided instruction names and bit encodings are used to automatically generate decoders for the simulator. Unlike many ADLs, Pydgin does not require that the user explicitly specify instruction types or mask bits for field matching because the Pydgin decoder generator can automatically infer decoder fields from the encoding table.

encodings = [
  ['adc',      'xxxx00x0101xxxxxxxxxxxxxxxxxxxxx'],
  ['add',      'xxxx00x0100xxxxxxxxxxxxxxxxxxxxx'],
  ['and',      'xxxx00x0000xxxxxxxxxxxxxxxxxxxxx'],
  ['b',        'xxxx1010xxxxxxxxxxxxxxxxxxxxxxxx'],
  ['bl',       'xxxx1011xxxxxxxxxxxxxxxxxxxxxxxx'],
  ['bic',      'xxxx00x1110xxxxxxxxxxxxxxxxxxxxx'],
  ['bkpt',     '111000010010xxxxxxxxxxxx0111xxxx'],
  ['blx1',     '1111101xxxxxxxxxxxxxxxxxxxxxxxxx'],
  ['blx2',     'xxxx00010010xxxxxxxxxxxx0011xxxx'],
  # ...
  ['teq',      'xxxx00x10011xxxxxxxxxxxxxxxxxxxx'],
  ['tst',      'xxxx00x10001xxxxxxxxxxxxxxxxxxxx'],
]
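The bit-pattern strings above already contain enough information to build a decoder: each 'x' is a don't-care bit, so a (mask, match) pair can be derived per instruction. A minimal sketch of that inference in plain Python (our own illustration of the idea, not Pydgin's actual decoder generator):

```python
# A tiny excerpt of the encoding table from the post; 'x' marks
# don't-care bits in each 32-bit pattern.
encodings = [
    ['adc', 'xxxx00x0101' + 'x' * 21],
    ['add', 'xxxx00x0100' + 'x' * 21],
    ['and', 'xxxx00x0000' + 'x' * 21],
    ['b',   'xxxx1010'    + 'x' * 24],
]

def make_decoder(encodings):
    table = []
    for name, bits in encodings:
        # mask: 1 wherever the pattern fixes a bit; match: the fixed bits
        mask  = int(bits.replace('0', '1').replace('x', '0'), 2)
        match = int(bits.replace('x', '0'), 2)
        table.append((name, mask, match))
    def decode(word):
        for name, mask, match in table:
            if word & mask == match:
                return name
        raise ValueError('unknown instruction: 0x%08x' % word)
    return decode

decode = make_decoder(encodings)
# 0xE0910002 encodes ADDS r0, r1, r2, so decode returns 'add'
```

A table-driven linear scan is the simplest correct strategy; a real generator would additionally pick discriminating bit fields to build a jump-table decoder, which is what makes the automatic inference valuable.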

A major goal of Pydgin was ensuring instruction semantic definitions map to ISA manual specifications as much as possible. The code below shows one such definition for the ARMv5 add instruction. A user-defined Instruction class (not shown) specifies field names that can be used to conveniently access bit positions within an instruction (e.g. rd, rn, S). Additionally, users can choose to define their own helper functions, such as the condition_passed function, to create more concise syntax that better matches the ISA manual.

def execute_add( s, inst ):
  if condition_passed( s, inst.cond() ):
    a,   = s.rf[ inst.rn() ]
    b, _ = shifter_operand( s, inst )
    result = a + b
    s.rf[ inst.rd() ] = trim_32( result )

    if inst.S():
      if inst.rd() == 15:
        raise FatalError('Writing SPSR not implemented!')
      s.N = (result >> 31)&1
      s.Z = trim_32( result ) == 0
      s.C = carry_from( result )
      s.V = overflow_from_add( a, b, result )

    if inst.rd() == 15:
      return

  s.rf[PC] = s.fetch_pc() + 4

Compared to the ARM ISA Reference manual shown below, the Pydgin instruction definition is a fairly close match. Pydgin's definitions could certainly be made more concise by using a custom DSL, however, this would lose many of the debugging benefits afforded to a well-supported language such as Python and additionally require using a custom parser that would likely need modification for each new ISA.

if ConditionPassed(cond) then
    Rd = Rn + shifter_operand
    if S == 1 and Rd == R15 then
        if CurrentModeHasSPSR() then
            CPSR = SPSR
        else UNPREDICTABLE
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = CarryFrom(Rn + shifter_operand)
        V Flag = OverflowFrom(Rn + shifter_operand)
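To make the add semantics runnable stand-alone, the arithmetic helpers it relies on can be sketched in plain Python, following the ARM manual's CarryFrom and OverflowFrom definitions (a hedged illustration; Pydgin's actual helpers may differ):

```python
def trim_32(value):
    # truncate to the 32-bit register width
    return value & 0xFFFFFFFF

def carry_from(result):
    # ARM CarryFrom: the unsigned addition produced a carry out of bit 31
    return 1 if result > 0xFFFFFFFF else 0

def overflow_from_add(a, b, result):
    # ARM OverflowFrom (addition): signed overflow occurs when both
    # operands share a sign but the result's sign differs
    sa, sb = (a >> 31) & 1, (b >> 31) & 1
    sr = (result >> 31) & 1
    return 1 if sa == sb and sa != sr else 0
```

Keeping these as ordinary Python functions is what lets a Pydgin-style description execute directly under CPython or PyPy for debugging, as described earlier.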

Creating an ISS that can run real applications is a rather complex task, even for a bare metal simulator with no operating system such as Pydgin. Each system call in the C library must be properly implemented, and bootstrapping code must be provided to set up the program stack and architectural state. This is a very tedious and error-prone process which Pydgin tries to encapsulate so that it remains as transparent to the end user as possible. In future versions of Pydgin we hope to make bootstrapping more painless and support a wider variety of C libraries.

Pydgin Performance

In order to achieve good simulation performance from Pydgin ISSs, significant work went into adding appropriate JIT annotations to the Pydgin library components. These optimization hints, which allow the JIT generated by the RPython translation toolchain to produce more efficient code, have been specifically selected for the unique properties of ISSs. For the sake of brevity, we do not talk about the exact optimizations here but a detailed discussion can be found in the ISPASS paper [1]. In the paper we evaluate two ISSs, one for a simplified MIPS ISA and another for the ARMv5 ISA, whereas below we only discuss results for the ARMv5 ISS.

The performance of Pydgin-generated ARMv5 ISSs was compared against several reference ISSs: the gem5 ARM atomic simulator (gem5), interpretive and JIT-enabled versions of SimIt-ARM (simit-nojit and simit-jit), and QEMU. Atomic models from the gem5 simulator were chosen for comparison due to their wide usage amongst computer architects [g]. SimIt-ARM was selected because it is currently the highest performance ADL-generated DBT-ISS publicly available. QEMU has long been held as the gold standard for DBT simulators due to its extremely high performance; however, QEMU is generally intended for usage as an emulator rather than a simulator [c] and therefore achieves its excellent performance at the cost of observability. Unlike QEMU, all other simulators in our study faithfully track architectural state at an instruction level rather than block level. Pydgin ISSs were generated with and without JITs using the RPython translation toolchain in order to help quantify the performance benefit of the meta-tracing JIT.

The figure below shows the performance of each ISS executing applications from the SPEC CINT2006 benchmark suite [h]. Benchmarks were run to completion on the high-performance DBT-ISSs (simit-jit, pydgin-jit, and QEMU), but were terminated after only 10 billion simulated instructions for the non-JITed interpretive ISSs (these would require many hours, in some cases days, to run to completion). Simulation performance is measured in MIPS [i] and plotted on a log scale due to the wide variance in performance. The WHMEAN group summarizes each ISS's performance across all benchmarks using the weighted harmonic mean.
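The WHMEAN summary weights each benchmark by its work (e.g., simulated instruction count), so it corresponds to total instructions executed over total simulation time. A small sketch of the computation (the benchmark numbers in the usage below are illustrative only, not results from the paper):

```python
def weighted_harmonic_mean(rates, weights):
    # rates: per-benchmark simulation speeds in MIPS
    # weights: per-benchmark work, e.g. simulated instruction counts
    # Since time_i = weight_i / rate_i, this equals total work / total time.
    total_work = sum(weights)
    total_time = sum(w / r for r, w in zip(rates, weights))
    return total_work / total_time
```

The harmonic (rather than arithmetic) mean is the right summary for rates: a simulator that is slow on a long benchmark should be penalized in proportion to the time actually spent there.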

A few points to take away from these results:

  • ISSs without JITs (gem5, simit-nojit, and pydgin-nojit) demonstrate relatively consistent performance across applications, whereas ISSs with JITs (simit-jit, pydgin-jit, and QEMU) demonstrate much greater performance variability from application-to-application.
  • The gem5 atomic model demonstrates particularly miserable performance, only 2-3 MIPS!
  • QEMU lives up to its reputation as a gold-standard for simulator performance, leading the pack on nearly every benchmark and reaching speeds of 240-1120 MIPS.
  • pydgin-jit is able to outperform simit-jit on four of the applications, including considerable performance improvements of 1.44–1.52× for the applications 456.hmmer, 462.libquantum, and 471.omnetpp (managing to even outperform QEMU on 471.omnetpp).
  • simit-jit is able to obtain much more consistent performance (230-459 MIPS across all applications) than pydgin-jit (9.6-659 MIPS). This is due to simit-jit's page-based approach to JIT optimization compared to pydgin-jit's tracing-based approach.
  • 464.h264ref displays particularly bad pathological behavior in Pydgin’s tracing JIT and is the only application to perform worse on pydgin-jit than pydgin-nojit (9.6 MIPS vs. 21 MIPS).

The pathological behavior demonstrated by 464.h264ref was of particular concern because it caused pydgin-jit to perform even worse than having no JIT at all. RPython JIT logs indicated that the reason for this performance degradation was a large number of tracing aborts due to JIT traces growing too long. However, time limitations before the publication deadline prevented us from investigating this issue thoroughly.

Since the deadline we've applied some minor bug fixes and made some small improvements in the memory representation. More importantly, we've addressed the performance degradation in 464.h264ref by increasing trace lengths for the JIT. Below we show how the performance of 464.h264ref changes as the trace_limit parameter exposed by the RPython JIT is varied from the default size of 6000 operations.
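The trace limit is a standard RPython JIT parameter that an interpreter can raise per jit driver with jit.set_param. A hedged sketch of how the larger limit might be applied (this only compiles under the RPython translation toolchain; the driver and its greens/reds here are illustrative, not Pydgin's actual code):

```python
from rpython.rlib import jit

# illustrative jit driver; Pydgin's real JitDriver has its own greens/reds
jitdriver = jit.JitDriver(greens=['pc'], reds=['state'])

def configure_jit():
    # raise the default trace_limit (6000 operations) so that long
    # straight-line paths like those in 464.h264ref stop aborting traces
    jit.set_param(jitdriver, 'trace_limit', 400000)
```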

By quadrupling the trace limit we achieve an 11x performance improvement in 464.h264ref. The larger trace limit allows the JIT to optimize long code paths that were previously triggering trace aborts, greatly helping amortize the costs of tracing. Note that arbitrarily increasing this limit can potentially hurt performance if longer traces are not able to detect optimizable code sequences.

After performing similar experiments across the applications in the SPEC CINT2006 benchmark suite, we settled on a trace limit of 400,000 operations. In the figure below we show how the updated Pydgin ISS (pydgin-400K) improves performance across all benchmarks and fixes the performance degradation previously seen in 464.h264ref. Note that the non-JITted simulators have been removed for clarity, and simulation performance is now plotted on a linear scale to more clearly distinguish the performance gap between each ISS.

With these improvements, we are now able to beat simit-jit on all but two benchmarks. In future work we hope to further close the gap with QEMU as well.

Conclusions and Future Work

Pydgin demonstrates that the impressive work put into the RPython translation toolchain, designed to simplify the process of building fast dynamic-language VMs, can also be leveraged to build fast instruction set simulators. Our prototype ARMv5 ISS shows that Pydgin can generate ISSs with performance competitive to SimIt-ARM while also providing a more productive development experience: RPython allowed us to develop Pydgin with only four person-months of work. Another significant benefit of the Pydgin approach is that any performance improvements applied to the RPython translation toolchain immediately benefit Pydgin ISSs after a simple software download and retranslation. This allows Pydgin to track the continual advances in JIT technology introduced by the PyPy development team.

Pydgin is very much a work in progress. There are many features we would like to add, including:

  • more concise syntax for accessing arbitrary instruction bits
  • support for other C libraries such as glibc, uClibc, and musl (we currently only support binaries compiled with newlib)
  • support for self-modifying code
  • features for more productive debugging of target applications
  • ISS descriptions for other ISAs such as RISC-V, ARMv8, and x86
  • automatic generation of compilers and toolchains from Pydgin descriptions

In addition, we think there are opportunities for even greater performance improvements with more advanced techniques such as:

  • automatic generation of optimized instruction decoders
  • optimizations for floating-point intensive applications
  • multiple tracing-JITs for parallel simulation of multicore SOCs
  • a parallel JIT compilation engine as proposed by Böhm et al. [3]

We hope that Pydgin can be of use to others, so if you try it out please let us know what you think. Feel free to contact us if you find any of the above development projects interesting, or simply fork the project on GitHub and hack away!

-- Derek Lockhart and Berkin Ilbeyi

Acknowledgements

We would like to sincerely thank Carl Friedrich Bolz and Maciej Fijalkowski for their feedback on the Pydgin publication and their guidance on improving the JIT performance of our simulators. We would also like to thank the whole PyPy team for their incredible work on PyPy and the RPython translation toolchain. Finally, thank you to our research advisor, Prof. Christopher Batten, and the sponsors of this work, which include the National Science Foundation, the Defense Advanced Research Projects Agency, and Intel Corporation.

Footnotes

[a]Pydgin loosely stands for [Py]thon [D]SL for [G]enerating [In]struction set simulators and is pronounced the same as “pigeon”. The name is inspired by the word “pidgin” which is a grammatically simplified form of language and captures the intent of the Pydgin embedded-ADL. https://github.com/cornell-brg/pydgin
[b]Popular instruction set architectures (ISAs) include MIPS, ARM, x86, and more recently RISC-V.
[c](1, 2) For a good discussion of simulators vs. emulators, please see the following post on StackOverflow: http://stackoverflow.com/questions/1584617/simulator-or-emulator-what-is-the-difference
[d]http://en.wikipedia.org/wiki/Dark_silicon
[e]Please see the Pydgin paper for a more detailed discussion of prior work.
[f]

For more examples of Pydgin ISA specifications, please see the ISPASS paper [1] or the Pydgin source code on GitHub.

Pydgin instruction definitions for a simple MIPS-inspired ISA can be found here:

Pydgin instruction definitions for a simplified ARMv5 ISA can be found here:

[g]

gem5 is a cycle-level simulation framework that contains both functional-level (atomic) and cycle-level processor models. Although primarily used for detailed, cycle-approximate processor simulation, gem5's atomic model is a popular tool for many ISS tasks.

[h]All performance measurements were taken on an unloaded server-class machine.
[i]Millions of instructions per second.

References

[1](1, 2, 3)

Derek Lockhart, Berkin Ilbeyi, and Christopher Batten. "Pydgin: Generating Fast Instruction Set Simulators from Simple Architecture Descriptions with Meta-Tracing JIT Compilers." IEEE Int'l Symp. on Performance Analysis of Systems and Software (ISPASS), Mar. 2015.

[2]

Derek Lockhart, Gary Zibrat, and Christopher Batten. "PyMTL: A Unified Framework for Vertically Integrated Computer Architecture Research." 47th ACM/IEEE Int'l Symp. on Microarchitecture (MICRO-47), Dec. 2014.

[3]I. Böhm, B. Franke, and N. Topham. Generalized Just-In-Time Trace Compilation Using a Parallel Task Farm in a Dynamic Binary Translator. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Jun 2011.

March 13, 2015 02:14 PM


A. Jesse Jiryu Davis

Mongo Conduction: Or, What I Did For Spring SkunkWorks

MongoDB, Inc. holds quarterly skunkworks sessions—basically a hackathon, but more relaxed. We set aside three days to work on neat hacks, or to start deep projects that need uninterrupted concentration, or to do something new outside our regular duties.

For SkunkWorks last week I did three related projects:

MockupDB, a MongoDB Wire Protocol server written in Python.

Mongo Conduction, a server that receives Wire Protocol messages and creates test deployments of MongoDB servers. It looks sort of like a JSON-over-HTTP RESTful API, but what it actually does is a BSON-over-Wire-Protocol RESTful API.

A test-suite runner written in C. It reads our standard driver test specifications from YAML files, sends commands to Mongo Conduction to create the cluster, and connects the C Driver, libmongoc, to the cluster. It does operations with the driver, and sends more commands to Mongo Conduction to alter the cluster while the driver is connected to it, and asserts that the outcomes of the driver operations match the expected outcomes from the standard test.

In the demo I'm using CLion, a new C/C++ IDE.

If you use the closed captions I added, let me know if I did an OK job; it's my first time captioning a video.

March 13, 2015 10:44 AM


Luca Botti

Google Closing Google Code

I guess all of you already know this, but Google is closing Google Code (announcement here). It's easy to say that this is just another BigG cloud service closing, but Code's demise was looming, and GitHub and Bitbucket (my favourite) offer plenty of alternatives. If you have one (or more) projects hosted on Code, this is the time to move; a button even appears to migrate them to GitHub.

March 13, 2015 08:42 AM


بايثون العربي

Network Programming in Python

This topic is our introduction to socket programming in Python. The socket is the foundation of every network connection a computer makes. For example, when we connect to a search engine by typing www.google.dz into the browser, the machine opens a socket and connects to the site to fetch the page and display it to us. The same applies to chat programs such as gtalk or Skype: every network connection goes through a socket.
In this lesson we will program a TCP socket using Python.
Before you start, you should be familiar with the basics of Python.


Creating a Socket
First we have to create the socket, and the socket.socket function does exactly that:

# A socket client example in Python
import socket  # import the socket module

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print 'Socket Created'

The socket.socket function creates the socket and returns a socket descriptor that we can use with the other socket functions.
The code above creates a socket with the following settings:
AF_INET: IPv4 addressing
SOCK_STREAM: sets the socket up for a TCP connection
Error Handling
If any socket function fails, Python raises an exception called socket.error.

# Error handling in Python socket programming
import socket
import sys  # for exit

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error, msg:
    print 'Failed to create socket. Error code: ' + str(msg[0]) + ' , Error message : ' + msg[1]
    sys.exit()

print 'Socket Created'

So we have created the socket successfully, but what next? Now we will try to connect to a server using this socket, say www.google.com.
Note
Besides SOCK_STREAM there is another socket type, SOCK_DGRAM, which is used for the UDP protocol. That kind of socket is connectionless; in this lesson we work with SOCK_STREAM, which is used for TCP.
UDP socket programming in Python
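To make the note above concrete, here is a minimal, self-contained SOCK_DGRAM sketch of my own (the helper name udp_echo_once is made up; both sockets live in one process and talk over localhost, just to show that UDP needs no connect() handshake):

```python
import socket

def udp_echo_once(message, host='127.0.0.1'):
    # A UDP "server" socket: bind to any free port on localhost.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind((host, 0))
    server.settimeout(2)  # don't block forever if the datagram is lost
    port = server.getsockname()[1]

    # A UDP "client" socket: no connect() is needed, each datagram
    # carries its destination address in sendto().
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(message, (host, port))

    data, addr = server.recvfrom(1024)  # read one datagram back
    client.close()
    server.close()
    return data
```

Calling udp_echo_once(b'hello') returns the same bytes after a round trip through the loopback interface.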
Connecting to a Server
We will connect to a remote server on a specific port number, so we need two things: an IP address and a port number. You need to know the address of the server you want to reach; in this example we will use www.google.com as a sample.
Getting the Server's Address
Before connecting to the server we need its IP address. To get it with Python we do the following:


import socket  # for sockets
import sys     # for exit

try:
    # create an AF_INET, STREAM socket (TCP)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error, msg:
    print 'Failed to create socket. Error code: ' + str(msg[0]) + ' , Error message : ' + msg[1]
    sys.exit()

print 'Socket Created'

host = 'www.google.com'

try:
    remote_ip = socket.gethostbyname(host)
except socket.gaierror:
    # could not resolve
    print 'Hostname could not be resolved. Exiting'
    sys.exit()

print 'Ip address of ' + host + ' is ' + remote_ip

Now that we have the server's IP address, we can connect to it on a specific port using the connect function:

import socket  # for sockets
import sys     # for exit

try:
    # create an AF_INET, STREAM socket (TCP)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error, msg:
    print 'Failed to create socket. Error code: ' + str(msg[0]) + ' , Error message : ' + msg[1]
    sys.exit()

print 'Socket Created'

host = 'www.google.com'
port = 80

try:
    remote_ip = socket.gethostbyname(host)
except socket.gaierror:
    # could not resolve
    print 'Hostname could not be resolved. Exiting'
    sys.exit()

print 'Ip address of ' + host + ' is ' + remote_ip

# Connect to remote server
s.connect((remote_ip, port))

print 'Socket Connected to ' + host + ' on ip ' + remote_ip
Run the program:
$ python client.py
Socket Created
Ip address of www.google.com is 74.125.236.83
Socket Connected to www.google.com on ip 74.125.236.83

The socket was created and the connection succeeded. Now try connecting to a port other than 80: the connection will fail, which means that port is not open. This is logical (assuming, of course, that you are familiar with networking basics), and it is exactly why port-scanner programs exist.
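Building on the closed-port remark above, a port check is nothing more than a connect() attempt with a timeout. The helper below is my own sketch (the name check_port is made up), written in Python 3-compatible style:

```python
import socket

def check_port(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)  # don't hang on filtered ports
    try:
        s.connect((host, port))
        return True        # the port accepted our connection
    except (socket.timeout, socket.error):
        return False       # refused, unreachable, or timed out
    finally:
        s.close()
```

For example, check_port('www.google.com', 80) should return True from a machine with Internet access, while a port nothing is listening on returns False.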

This concludes the first lesson. I have split the topic into several lessons so that you, dear reader, can absorb and understand it. Do try connecting to other servers, and search the Internet for more information; don't stop at this lesson.
In the next lesson we will learn how to send and receive data to and from the server. Until then, goodbye.

Network Programming in Python: Sending and Receiving Data




March 13, 2015 07:31 AM

March 12, 2015


Mauveweb

New, free Python Jobs board

Recently we've been on a recruitment drive, trying to fill a number of roles for experienced Python developers. The Python.org jobs board has been frozen for a while, so to assist us in meeting new candidates we tossed around ideas for a free, community-run jobs board: it would have to be a static site; it would have to be on GitHub; employers should be able to list a job just by submitting a pull request. And then Steve went ahead and wrote it:

http://pythonjobs.github.io/

Please, please bookmark it, tweet it, reblog it, even if you're not looking for a job right now. It only works if it gets eyeballs. And of course, it's completely free, for everyone, forever. It's by Pythonistas, for Pythonistas.

If you are hiring (and are not a recruitment agent), knock up a Markdown file describing the role you're looking to fill (plus some metadata) and send us a pull request. Instructions are in the GitHub README.

We'll accept job listings from anywhere in the world. Sure it's not very easy to navigate by region yet. That may be the next job. Perhaps you could help out - pull requests don't have to be limited to new job postings, hint, hint! (Build machinery/templates are in this repo).

On a personal note I want everyone in this community to be employed, happy, and making a comfortable living. Perhaps this site can help make that happen? I'd love to hear your feedback/experiences; use the Disqus gizmos below.

Update 9:45pm UTC: Talking to the team, I discover I'm mistaken: we're actually going to allow recruiters to post job opportunities, providing they do all the work in sending us a pull request and include full relevant details such as the identity of the employer.

March 12, 2015 07:35 PM


Graham Dumpleton

Using wrapt to support testing of software.

When talking about unit testing in Python, one of the more popular packages used to assist in the task is the Mock package. I will no doubt be labelled a heretic, but when I have tried to use it for things, it just doesn't seem to sit right with my way of thinking. It may also just be that what I am trying to apply it to isn't a good fit. In what I want to test it usually isn't so much that I

March 12, 2015 05:50 PM


Python Software Foundation

BBC launches MicroBit

This morning, PSF Fellow Nicholas Tollervey of the UK posted the following to the PSF Members List:
"Today the BBC announced the MicroBit (part designed by [Pythonista and friend] Michael Sparks) - http://www.bbc.co.uk/news/technology-31834927.
About 1 million of these small programmable devices will be given away to 11-12 year olds starting their secondary education at the start of the UK’s next academic year in September.
Students will use the devices to learn programming and to create games. Python is one of the three languages that work with the device.
The PSF is involved in helping to generate community sourced Python resources for the project and we hope the MicroBit will be a big part of PyconUK’s education track (taking place at the end of September).
In addition to the BBC article above, the Guardian gives a good summary and mentions the PSF here:
http://www.theguardian.com/technology/2015/mar/12/bbc-micro-bit-raspberry-pi
I have an alpha-version of the device sitting on my desk and my impression is that kids will have a lot of fun. Think Pythonic blinkenlights, buttons, bluetooth and IO."
Image credit wired.co.uk

BBC director-general Tony Hall, speaking at the program’s launch, had this to say: "The BBC, our partners and everyone involved want this to be a defining moment for digital creativity, and a vital one for our country’s digital economy."
The MicroBit is not intended to compete with other devices. In fact, The Raspberry Pi Foundation is involved in creating learning content for the device, and the final version of the MicroBit will be able to connect via Bluetooth with Raspberry Pis and other computers, including Kano and Arduino. One goal is to teach children to write code in collaboration with others, so interconnectivity will be important.
According to the BBC article, the initiative to increase computer skills among UK school children is motivated by need, "with 1.4 million digital professionals estimated to be needed over the next five years." To answer this need, the Make it Digital Initiative is a group effort, involving approximately 50 other organizations including Microsoft, Google, Code Club, and the PSF.
A list of other participating organizations can be seen here, along with Nicholas' statement explaining the PSF’s participation:
"Education is a fundamental aspect of the continuing development of the Python programming language. The Python Software Foundation (PSF) and wider Python programming community fully support BBC Make it Digital’s efforts to encourage, engage and inspire the programmers of tomorrow. The Micro Bit is a fuse to ignite an explosion in digital creativity and we’re delighted to be a partner in such Python programming pyromania."

March 12, 2015 03:43 PM


PyCharm

Best Interactive Programming Course Contest 2015 is over!

pycharm_contest_banner

On March 5th the Best Interactive Programming Course contest organized by JetBrains came to a close. Based around the theme of PROGRAMMING EDUCATION WITH PYTHON, this contest was a great chance for teachers and instructors all around the globe to show their experience and skills as they strove to create a programming course that thousands of learners would use inside PyCharm Educational Edition.

We hope it was a great experience for everyone who attempted to create a course using PyCharm Educational Edition. The courses this contest has produced will surely be helpful for thousands of students around the globe.

The Results

All submitted courses have been scored on:

And the winners are:
1st place: John Zurawski, “Logging in Python”
This course provides an introduction to the standard Python logging module. It includes basic and advanced examples that can help you debug your Python software. Best practices are demonstrated to help you get the most out of Python logging. This course was tested using Python 2.7 and Python 3.4.

2nd place: Lisa C, “Introduction to Classic Ciphers”
Python implementations of classic text ciphers. Appropriate for Python beginners who have had some practice with manipulating strings and lists, writing ‘for’ loops, and organizing code into functions.

3rd place: Tal Einat, “Python Unit-Testing”
An introductory interactive course about unit testing in Python.

The jury also selected “Introduction to Classic Ciphers” as the Best Course Idea.

Thanks to everyone who entered in the contest and congratulations to the winners!

The winning courses will soon be available in PyCharm Educational Edition along with the default “Introduction to Python” course. To check them out, go to File | New Project | Educational and click the refresh button.

This has been an amazing experience for the PyCharm team. Thanks a lot for all your entries and keep the love flowing!

What’s up next?

Currently we’re working on the next version of PyCharm Educational Edition. It’s going to be more polished, new features will be introduced, and of course this new release will address various usability issues and common problems that both students and educators experienced with the first version!

Stay tuned for further news, subscribe to both the PyCharm and PyCharm Educational Edition Twitter accounts, and report any problems you find to our public issue tracker.

Develop and learn with pleasure!
PyCharm team

March 12, 2015 03:30 PM


بايثون العربي

How to Use the File Transfer Protocol (FTP) in Python

There are many ways to download files from the Internet with Python; the most popular is to connect to an FTP server and download the files. That is what we will explain in this post: how to use the FTP protocol in Python with the help of the ftplib module.
This module lets us write programs that perform various FTP tasks. We can connect to an FTP server, download files, and process them on our computer. There is no need to install the library: it ships with Python and contains all the functions we need.
To take advantage of everything the ftplib module offers, we first have to import it into our program:

import ftplib

After importing the module, we open a connection to the FTP server, which first requires creating an FTP object. Once the connection is open we can use all the functions in ftplib.
There are two kinds of transfer functions in ftplib: one for text files and one for binary files.
We can of course move around the server easily and manage and download files.
Let's take an example that covers the basic FTP operations: connecting to a server, listing the files and directories on it, and moving from one directory to another.


import ftplib  # import the ftplib module

server_name = "ftp.novell.com"
username = "anonymous"
password = "bwdayley@novell.com"

ftp = ftplib.FTP(server_name, username, password)

print 'Welcome', ftp.getwelcome()  # receive the server's welcome message

print ("File List:")
files = ftp.dir()  # list all files and directories on the server

ftp.cwd("/forge")  # change to a subdirectory

print ('Forge files are :')
_file = ftp.dir()  # list the contents of the subdirectory
ftp.quit()
After running the program, it displays the following:





Connection Status
To make sure we have connected to the server successfully, we can use the ftp.getwelcome() function, which returns the server's welcome message. With the previous example we get a message like:


Uploading a File to the Server
Of course, the point of connecting to a server is not just to browse files; we will also need to upload and download them. Let's now upload a file to the server.
I will write a very simple piece of code so we can understand it together.
Note: the file to upload does not need to be in the same directory as our program.
fichier = "/home/kader/Desktop/example.sh"  # the file I want to upload
file1 = open(fichier, 'rb')  # open the file
ftp.storbinary('STOR ' + fichier, file1)  # send the file to the server
file1.close()  # close the file
ftp.quit()
Now that we have sent the file to the server, let's explain the line that did the work: 'STOR ' + fichier.
You should know that 'STOR' is an FTP protocol command, not a Python command. With it we tell the server to store the following file: STOR /home/kader/Desktop/example.sh
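The download direction mirrors the upload: where 'STOR' stores a file on the server, the FTP 'RETR' command retrieves one, and ftplib exposes it through retrbinary(), which calls a callback for every chunk of data received. A small helper of my own (the name fetch_file is made up):

```python
import ftplib

def fetch_file(ftp, remote_name, local_path):
    # retrbinary() sends 'RETR <name>' to the server and invokes
    # the callback once per chunk of data received.
    local = open(local_path, 'wb')
    try:
        ftp.retrbinary('RETR ' + remote_name, local.write)
    finally:
        local.close()
```

Used as fetch_file(ftp, 'readme.txt', '/tmp/readme.txt') on a connection opened as in the example above.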

Renaming Files and Directories
Sometimes we need to rename the files and directories on the server, so let's find out how to do that.


rename = ftp.rename("old name", "new name")
As you can see, files can be renamed with a single line. Now I will write all the common operations as one simple snippet:
# delete a file
delete = ftp.delete("File name")

# create a new directory
rep = ftp.mkd("Directory name")

# delete a directory
del_dir = ftp.rmd("Directory name")
Later on I will write a complete program and share the code with you.

March 12, 2015 02:32 PM


Europython

EuroPython 2015: Call for proposal dates available

The Program work group (WG) has decided on the Call for Proposals (CFP) dates:

Monday, 2015-03-16 to Tuesday, 2015-04-14

You will be able to submit your proposals through the EuroPython website during these 4 weeks.

We have these types of presentations available for submission:

Please note that the exact number of submissions we can accept depends on schedule and room requirements, so the above numbers are only estimates. Talk times include time for questions.

The full Call for Proposals with all details will be made available on Monday, 2015-03-16. We are publishing these dates early because we’ve been getting a lot of requests for them.

Talks/Trainings in Spanish and Basque

Since EuroPython is hosted in Bilbao and EuroPython has traditionally always been very open to the local Python communities, we are also accepting a number of talks and trainings in Spanish and Basque.

All other talks/trainings should be held in English.

Talk voting

As in 2013, we will again have talk voting, which means that attendees who have already registered will get to see the talk submissions and can vote on them. The Program WG will also set aside a number of slots which they will then select based on other criteria to e.g. increase diversity or give a chance to less mainstream topics.

The schedule will then be announced early in May.

Enjoy,

EuroPython 2015 Team

March 12, 2015 01:13 PM


Captain DeadBones Chronicles

Have you tried out Spyder-IDE for Python Dev?

Personally, I am not much of an IDE person. I love a good plain-text editor and a command-line tool. However, Spyder is not bad at all. You can download it from the Spyder Bitbucket repository. Give it a try; it works right out of the box. There are 3 main sub-windows: a color-coded text editor, a variable table, and a console output. The first time you open it you should see something like this:

spyder_ide_first_run

Let's run a sample program, something like this:

#trial run Spyder IDE

x = input("Enter a number: ")
y = raw_input("Enter another number: ")

try:
    z = x*y
except:
    print "ERROR! Incompatible types"

print z

print x*int(y)

Now when we hit the Run button, magic happens: we enter the user input in the console area, and after that our variables, with their values and data types, show up in the variable panel. This saves a ton of time and effort during development and debugging. Here is how it all looks:

spyder_first_impressions

Personally, I love this little app. It is great! Now, is it enough to make me transition from my old ways? Maybe. I am not fully convinced; I am going to need to play around with it some more. I can see great potential in this app. New IDEs are coming out every day now, and this is one of the best. It is the only one I would consider giving a fair chance.

The post Have you tried out Spyder-IDE for Python Dev? appeared first on Captain DeadBones Chronicles.

March 12, 2015 11:22 AM


eGenix.com

eGenix mxODBC 3.3.2 GA

Introduction

mxODBC provides an easy-to-use, high-performance, reliable and robust Python interface to ODBC compatible databases such as MS SQL Server, Oracle Database, IBM DB2, Informix and Netezza, SAP Sybase ASE and Sybase Anywhere, Teradata, MySQL, MariaDB, PostgreSQL, SAP MaxDB and many more:

>>>   mxODBC Product Page

The eGenix mxODBC - Python ODBC Database Interface product is a commercial extension to our open-source  eGenix mx Base Distribution:

>>>   mx Base Distribution Page

News

The 3.3.2 release of our mxODBC is a patch level release of our popular Python ODBC Interface for Windows, Linux, Mac OS X and FreeBSD. It includes these enhancements and fixes:

Driver Compatibility:

MS SQL Server

SAP Sybase ASE

Misc:

The complete list of changes is available on the mxODBC changelog page.

Features

mxODBC 3.3 was released on 2014-04-08. These are the highlights of the new release:

mxODBC 3.3 Release Highlights

Stored Procedures

User Customizable Row Objects

Fast Cursor Types

mxODBC 3.3 Driver Compatibility Enhancements

Oracle

MS SQL Server

Sybase ASE

IBM DB2

PostgreSQL

MySQL

For the complete set of features, please have a look at the mxODBC product page.

Editions

mxODBC is available in these two editions:

For a complete overview of the available editions, please see the product page.

Downloads

Please visit the eGenix mxODBC product page for downloads, instructions on installation and documentation of the packages.

Note that in order to use the eGenix mxODBC product, you first need to install our open-source eGenix mx Base Distribution.

You can also simply use:

pip install egenix-mxodbc

and then request 30-day evaluation licenses from our web-site.

Upgrading

Users are encouraged to upgrade to this latest mxODBC release to benefit from the new features and updated ODBC driver support.

We have taken special care not to introduce backwards incompatible changes, making the upgrade experience as smooth as possible.

Customers who have purchased mxODBC 3.3 licenses can continue to use their licenses with this patch level release.

For upgrade purchases, we will give out 20% discount coupons going from mxODBC 2.x to 3.3 and 50% coupons for upgrades from mxODBC 3.x to 3.3. Please contact the eGenix.com Sales Team with your existing license serials for details for an upgrade discount coupon.

If you want to try the new release before purchase, you can request 30-day evaluation licenses by visiting our web-site or writing to sales@egenix.com, stating your name (or the name of the company) and the number of eval licenses that you need.

More Information

For more information on the eGenix.com Python products, licensing and download instructions, please write to sales@egenix.com.

Enjoy !

Marc-Andre Lemburg, eGenix.com

March 12, 2015 10:00 AM

March 11, 2015


Ludovic Gasc

Benchmark Python Web production stack: Nginx with uWSGI, Meinheld and API-Hour

Disclaimer: If you have some bias and/or dislike AsyncIO, please read my previous blog post before starting a war.


Tip: If you don't have the time to read the text, scroll down to see graphics.


Summary of previous episodes



After the publication of “Macro-benchmark with Django, Flask and AsyncIO (aiohttp.web+API-Hour)”, I received a lot of remarks, this is a synthesis:


  1. It’s impossible, you change the numbers/you don’t measure the right values/…: Come on, people: if you don’t believe me, test it yourself. I’ve published as much information as possible in the API-Hour repository so that the results can be reproduced by others. Don’t hesitate to ask me if you have trouble testing it yourself.


    Nginx is configured to avoid being a bottleneck for Python daemons.
  2. Changing kernel parameters is a cheat: No, it isn’t a cheat; most production applications recommend doing this, not only for benchmarks. Examples:
    1. PostgreSQL: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server shared_buffers config (BTW, I forgot to push my kernel settings for postgresql, it's now in the repository)
    2. Nginx: http://wiki.nginx.org/SSL-Offloader section preparation


  3. Daemons were all in debug mode: All daemons were impacted; I’ve disabled that. I’ve relaunched the localhost benchmark on the /agents endpoint and get almost the same values, most likely because I had already disabled logging globally in my benchmarks.
  4. You should disable middlewares in Django: In production you would keep them, but I’ve nevertheless disabled them to be fair to the other frameworks, where I don’t use middlewares.


  5. wrk/wrk2 aren’t good tools to benchmark HTTP, they can hit too hard: Hitting as hard as possible is the very goal of a benchmark, isn’t it? FYI, almost all serious benchmark reports on the Web use wrk. As framework performance keeps increasing, the tools used to challenge frameworks have to hit harder to bring out the differences.


  6. Keep-alive isn’t enabled for Flask or Django / Nobody uses Flask or Django alone in prod, you must use Nginx and uWSGI/Meinheld: No problem: you’ll find below a new series of benchmarks based on these remarks.


In this article, I’ll test three scenarios:
  1. A resource limit use case: 4000 requests/s with wrk2 during 5 minutes
  2. A standard use case: 50 simultaneous connections with wrk during 5 minutes
  3. A slow use case: 10 requests/s with wrk2 during 30 seconds
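For the record, the three scenarios translate into invocations roughly like the following. The URL, thread, and connection counts are my guesses, not the article's exact commands (those live in the API-Hour repository); wrk2's -R flag is the one that fixes the request rate:

```shell
# Scenario 1: a constant 4000 requests/s for 5 minutes (wrk2)
wrk2 -t4 -c100 -R4000 -d5m http://server/agents

# Scenario 2: 50 open connections going as fast as possible for 5 minutes (wrk)
wrk -t4 -c50 -d5m http://server/agents

# Scenario 3: a slow, constant 10 requests/s for 30 seconds (wrk2)
wrk2 -t2 -c10 -R10 -d30s http://server/agents
```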


To be closer with a production scenario, I’ll test only via the network and with agents list endpoint that uses a database connection, as described in my previous article.


I test 6 architectures:
  1. Django+Meinheld+Nginx
  2. Django+uWSGI+Nginx
  3. Flask+Meinheld+Nginx
  4. Flask+uWSGI+Nginx
  5. API-Hour+Nginx
  6. API-Hour without Nginx


As you can see, all of them sit behind an Nginx server except the last one, which serves as a control.


As usual, you can find config files in API-Hour repository: https://github.com/Eyepea/API-Hour/tree/master/benchmarks


Round 4: 4000 requests/s with wrk2



Requests per second
(Higher is better)
Latency (s)
(Lower is better)

As you can see, API-Hour without Nginx handles fewer requests per second; moreover:

Errors
(Lower is better)

API-Hour without Nginx shows much higher latency than the other solutions, but see the explanation below.
At first sight, you seem to handle more requests with Nginx, and you even get less latency. This last point is intriguing: how could you have less latency with API-Hour+Nginx than with API-Hour alone?
After careful examination, I saw a lot of 404 responses on the wire. My understanding is that Nginx returns a 404 if the framework behind it doesn’t answer fast enough. A 404 is a response for wrk, and a response that comes quickly (the Nginx timeout to the backend is short), so wrk can immediately launch another request. Hence you see more requests and less latency with Nginx, but (contrary to the control, API-Hour w/o Nginx) many requests have in fact not been answered properly.


Array results

                      Requests/s   Errors     Avg Latency (s)
Django+Meinheld       3992.68      1031238    0.121
Django+uWSGI          3991.96      1029213    0.072
Flask+Meinheld        3991.43      1024192    0.111
Flask+uWSGI           3994.09      1021953    0.066
API-Hour              3994.96      312600     0.043
API-Hour w/o Nginx    3646.15      0          9.74


To avoid these artefacts, round 5 doesn’t try to “force-feed” the frameworks with requests. Instead, it makes as many requests as possible (launching one as soon as the previous one is answered) over 50 parallel connections. Now the frameworks handle each request properly, so the error rate is zero for all of them. Again, to be fair, I decreased the number of simultaneous connections until all frameworks had a zero error rate, and went no lower (so this is the maximum number of connections for which all error rates are zero).

Round 5: 50 simultaneous connections with wrk



Requests per second
(Higher is better)
Errors
(Lower is better)

Latency (s)
(Lower is better)


Array results

                      Requests/s   Errors   Avg Latency (s)
Django+Meinheld       603.07       0        0.07977
Django+uWSGI          603.38       0        0.07958
Flask+Meinheld        623.85       0        0.07705
Flask+uWSGI           628.58       0        0.07655
API-Hour              3033.17      0        0.0161
API-Hour w/o Nginx    3610.96      0        0.01398


(Bonus) Round 6: 10 requests/s with wrk2 during 30 seconds



Not really interesting for a production environment, this test only validates that AsyncIO pays off even under a small load.
In this round I see no errors, and all frameworks handle 10 requests/s.


[Chart: latency in seconds (lower is better)]

Array results

Framework             Requests/s   Errors   Avg Latency (s)
Django+Meinheld       10           0        0.02142
Django+uWSGI          10           0        0.02083
Flask+Meinheld        10           0        0.01912
Flask+uWSGI           10           0        0.01896
API-Hour              10           0        0.00783
API-Hour w/o Nginx    10           0        0.00855


Conclusion

As demonstrated in my previous benchmark, API-Hour rocks.
Meinheld/uWSGI+Nginx help to increase performance and reduce the error rate for synchronous Python frameworks, but the internal architecture of your application has more impact on performance than changing an external component.

For API-Hour, it isn't a good idea to have Nginx as a reverse proxy, because you add latency and reduce performance, as you can see in round 5.
With API-Hour, you can use a subdomain to serve your static files with Nginx.
If a subdomain is not an option, you can also route the static traffic with HAProxy and a specific URL (a folder), but this would probably impact latency a bit, as HAProxy has to open all packets to get the URL and apply a regexp to know where to route it.
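The HAProxy routing just described could be sketched as a configuration fragment like the following; the backend names, addresses, and the /static prefix are illustrative assumptions, not taken from the article:

```
frontend www
    bind *:80
    # send requests whose path starts with /static to Nginx,
    # everything else to the API-Hour backend
    acl is_static path_beg /static
    use_backend nginx_static if is_static
    default_backend api_hour

backend nginx_static
    server nginx1 127.0.0.1:8080

backend api_hour
    server apihour1 127.0.0.1:8000
```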


As a side note, I'd say that making technically sound and fair benchmarks is not easy: you don't just point some tool at all the tested frameworks and present the figures.
Moreover, interpreting the results requires that you understand precisely what you're measuring. As you have seen above, nasty side-effects can creep in and produce weird artifacts, let alone create unfair results for some frameworks. On many occasions, I had to pull out Wireshark and dissect what was on the wire to understand what was really going on.


I tried my best in these two articles to make these tests as error-free, honest and fair as possible, also integrating some clever (non-trolling) remarks I received from some of you. Again, the method is explained and all sources (as well as my help if needed) are publicly available if you think other factors or settings should be taken into account.
I'd welcome any relevant remarks or proven affirmations, just like I'll dismiss any ungrounded, troll-style, non-constructive ones.

March 11, 2015 10:21 PM


Graham Dumpleton

Safely applying monkey patches in Python.

Monkey patching in Python is often seen as being one of those things you should never do. Some regard it as a useful necessity you can't avoid in order to patch bugs in third-party code. Others argue, though, that with so much software being Open Source these days, you should simply submit a fix to the upstream package maintainer. Monkey patching has its uses well beyond just patching
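The excerpt above only teases the topic, so here is a minimal sketch of what a monkey patch looks like in practice (the choice of json.dumps as the patch target is purely illustrative, not from the post):

```python
# A minimal monkey-patching sketch: rebind a function on a module at
# runtime, keeping a reference to the original so the patch can
# delegate to it and be undone later.
import json

_original_dumps = json.dumps

def patched_dumps(obj, **kwargs):
    # Force deterministic key order for every caller, without
    # touching any third-party code that calls json.dumps.
    kwargs.setdefault("sort_keys", True)
    return _original_dumps(obj, **kwargs)

json.dumps = patched_dumps

print(json.dumps({"b": 1, "a": 2}))  # prints {"a": 2, "b": 1}

# Restore the original once the patch is no longer needed.
json.dumps = _original_dumps
```

Keeping a reference to the original and restoring it afterwards is what makes a patch like this comparatively safe to apply and remove.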

March 11, 2015 07:02 PM


Logilab

Monitoring our websites before we deploy them using Salt

As you might have noticed, we're quite big fans of Salt. One of the things that Salt enables us to do is to apply what we're used to doing with code to our infrastructure. Let's look at TDD (Test Driven Development).

Write the test first, make it fail, implement the code, test goes green, you're done.

Apply the same thing to infrastructure and you get TDI (Test Driven Infrastructure).

So before you deploy a service, you make sure that your supervision (shinken, nagios, icinga, salt-based monitoring, etc.) is running the correct test, you deploy, and then your supervision goes green.

Let's take a look at website supervision. At Logilab we weren't too satisfied with how our shinken/http_check was working, so we started using uptime (nodejs + mongodb). Uptime has a simple REST API to get and add checks, so we wrote a salt execution module and a states module for it.

https://www.logilab.org/file/288174/raw/68747470733a2f2f7261772e6769746875622e636f6d2f667a616e696e6f74746f2f757074696d652f646f776e6c6f6164732f636865636b5f64657461696c732e706e67.png

For the sites that use the apache-formula we simply loop on the domains declared in the pillars to add checks:

{% for domain in salt['pillar.get']('apache:sites').keys() %}
uptime {{ domain }} (http):
  uptime.monitored:
    - name: http://{{ domain }}
{% endfor %}

For other URLs (specific URLs such as sitemaps) we can list them in pillars and do:

{% for url in salt['pillar.get']('uptime:urls') %}
uptime {{ url }}:
  uptime.monitored:
    - name: {{ url }}
{% endfor %}

That's it. Monitoring comes before deployment.
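Under the hood, states like the ones above just talk to uptime's REST API. A rough sketch of that idea in Python follows; the /api/checks path, the HTTP method, and the payload field names are assumptions for illustration, so check the uptime project's API docs for the real ones:

```python
# Sketch: describing and registering an HTTP check against a
# hypothetical uptime REST endpoint.
import json
import urllib.request

def build_check(url, interval=60):
    """Build the JSON payload describing one HTTP check."""
    return {"name": url, "url": url, "interval": interval}

def register_check(base_url, check):
    """PUT the check to the (assumed) /api/checks endpoint."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/checks",
        data=json.dumps(check).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return urllib.request.urlopen(req)

payload = build_check("http://example.com")
print(payload["name"])  # prints http://example.com
```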

We've also contributed a formula for deploying uptime.

Follow us if you are interested in Test Driven Infrastructure, for we intend to write regular reports as we make progress exploring this new domain.

March 11, 2015 06:23 PM


Yasoob Khalid

A guide to finding books in images using Python and OpenCV.

This is a guest post by Adrian Rosebrock from PyImageSearch.com, a blog all about computer vision, image processing, and building image search engines.


find_books_output

I always tell people that if they want to learn to write well they need to do two things:

  1. Practice writing often.
  2. Read. A lot.

It seems strange, doesn't it, that reading often can dramatically improve your writing ability?

But it’s absolutely true.

Reading authors that you like can actually ingrain their vernacular into your own writing style. And eventually, with enough practice, you can develop a style and voice of your own.

All that said, between the PyImageSearch blog, my book Practical Python and OpenCV, and PyImageSearch Gurus (a computer vision course I’m developing), I write (and read) a lot.

There will be moments when I’m literally walking down the street, coffee in hand, when a stroke of inspiration will strike me like a bolt of lightning.

And lest I let the fleeting thought disappear into the abyss of my subconscious, I have to stop in the middle of my walk, pull out my iPhone, and compose a blog post on my tiny screen and hard-to-use keyboard.

Is it a bit annoying at times? Yes. But it's a lot of fun. Both reading and writing are passions of mine – and one passion fuels the other.

So it should come as no surprise that my coffee table is covered in books right now.

As I’m sitting here looking at my coffee table, I decided, hey, why not create a Python script to count the number of books on my table? That would be pretty cool, right? I could merge two of my passions – books and computer vision.

And when Yasoob invited me to do a guest post, I couldn’t help but accept. In the rest of this blog post I will show you how to create a Python script to count the number of books in an image using OpenCV.

What are we going to do?

Let’s start by taking a look at our example image we are going to count books in:

example

We see there are four books in the image, along with various “distractors” such as a coffee mug, a Starbucks cup, multiple coasters, and a piece of candy.

Our goal here is to find the four books in the image while ignoring the distractors.

How are we going to do that?

Read on to find out!

What libraries will we need?

In order to build our system to find and detect books in images, we'll be utilizing two main libraries: NumPy (for numerical processing) and OpenCV (for the cv2 bindings).

Make sure you have these libraries installed!

Finding books in images using Python and OpenCV.

Let’s go ahead and get started.

Open up your favorite code editor, create a new file named find_books.py, and let’s get started:

# import the necessary packages
import numpy as np
import cv2

# load the image, convert it to grayscale, and blur it
image = cv2.imread("example.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
cv2.imshow("Gray", gray)
cv2.waitKey(0)

We’ll start by importing our required libraries. We’ll be using NumPy for numerical processing and cv2 for our OpenCV bindings.

Loading an image off disk is handled by the cv2.imread function. Here we are simply loading our image off disk, followed by converting it from the Red, Green, Blue (RGB) color space to grayscale.

We’ll also blur the image slightly to reduce high frequency noise and increase the accuracy of our code used to find books later in this post.

After executing our code, our output should look like this:

find_books_grayscale

Here you can see that we have loaded the image off disk, converted it to grayscale, and blurred it slightly.

Now, let's detect the edges (i.e. outlines) of the objects in the image:

# detect edges in the image
edged = cv2.Canny(gray, 10, 250)
cv2.imshow("Edged", edged)
cv2.waitKey(0)

Our edged image now looks like this:

find_books_edged

We have clearly found the outlines of the objects in the images. However, you’ll notice that some of the outlines are not “clean” and complete. There are gaps in between the outlines that we need to close in order to successfully detect our books.

To solve this, we’ll apply a “closing” operation to close the gaps between the white pixels in the image:

# construct and apply a closing kernel to 'close' gaps between 'white'
# pixels
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))
closed = cv2.morphologyEx(edged, cv2.MORPH_CLOSE, kernel)
cv2.imshow("Closed", closed)
cv2.waitKey(0)

Sure enough, the gaps in the outlines have been closed:

find_books_closed

The next step is to actually detect the outlines of the objects in the image. We’ll use the cv2.findContours function for that:

# find contours (i.e. the 'outlines') in the image and initialize the
# total number of books found
(cnts, _) = cv2.findContours(closed.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
total = 0

Let’s take a second and consider the geometry of a book.

A book is a rectangle. And a rectangle has four vertices. Therefore, if we examine a contour and find that it has four vertices, then we can assume it is a book and not one of the distractors in the image.

To check if a contour is a book or not, we need to loop over each of the contours individually:

# loop over the contours
for c in cnts:
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)

    # if the approximated contour has four points, then assume that the
    # contour is a book -- a book is a rectangle and thus has four vertices
    if len(approx) == 4:
        cv2.drawContours(image, [approx], -1, (0, 255, 0), 4)
        total += 1

This code block is where all the magic happens. For each of the contours we compute the perimeter using cv2.arcLength and then approximate the contour using cv2.approxPolyDP.

The reason we approximate the contour is because the outline may not be a perfect rectangle. Due to noise when the photo was captured or shadows in the image, it is possible (and even very likely) that the book will not have exactly four vertices. By approximating the contour we can ensure we are able to side-step this problem.

Lastly, we make a check to see if the approximated contour does indeed have four vertices. If it does, then we draw the contour surrounding the book and then increment the total number of books counter.

We’ll wrap this example up by writing the total number of books found to the terminal and displaying the output image:

# display the output
print "I found {0} books in that image".format(total)
cv2.imshow("Output", image)
cv2.waitKey(0)

At this point our output image should look like this:

find_books_output

And our terminal does indeed show that we have successfully found the four books in the image while ignoring the other distractor objects:

find_books_terminal_output

To execute the script yourself, open up a terminal and execute the following command:

$ python find_books.py

Summary

In this blog post you learned how to find books in images using simple image processing and computer vision techniques with Python and OpenCV.

In review, our approach was to:

  1. Load the image from disk and convert it to grayscale.
  2. Blur the image slightly.
  3. Apply the Canny edge detector to detect edges (i.e. outlines) of the objects in the image.
  4. Apply a closing morphological operation to close any gaps in the outlines.
  5. Find the contours of the objects in the image.
  6. Apply contour approximation to determine if the contour was a rectangle, and thus a book.

And that’s all there is to it!

I hope you enjoyed this blog post! And a big thanks to Yasoob for giving me this opportunity! If you’re interested in learning more about myself and computer vision, check out the PyImageSearch blog.

What are the next steps?

We are only scratching the surface of what we can do with computer vision and image processing. Finding contours is just the start.

If you’re interested in learning how to detect faces in images, track objects in video, or handwriting recognition, take a look at my book, Practical Python and OpenCV. Yasoob has written a review about one of my courses as well.

Downloads:

To download the source code and example images used in this article, use this link.


 

Guys, I hope that you enjoyed this interesting intro to image processing in Python and OpenCV. I am sure there will be more posts like this in the future. If you have any comments or feedback, do comment below.

See you till next time.


March 11, 2015 05:30 PM


Mike Driscoll

Python 101 50% Off

For the rest of March, you can get my book, Python 101 for 50% off if you use the following code: march15

Learn how to program with Python from beginning to end. My book is made primarily for beginners. However, at least two-thirds of it is aimed at intermediate programmers. You may be wondering how that works. The book is split into five parts.

Part One

The first part is the beginner section. In it you will learn all the basics of Python. From Python types (strings, lists, dictionaries) to conditional statements to loops. You will also learn about comprehensions, functions and classes and everything in between! Note: This section has been completed and is in the editing phase.

Part Two

This section will be a curated tour of the Python Standard Library. The intent isn’t to cover everything in it, but instead it is to show the reader that you can do a lot with Python right out of the box. We’ll be covering the modules I find the most useful in day-to-day programming tasks, such as os, sys, logging, threads, and more.

Part Three

This section is all intermediate level material. It covers the following:

  • lambda
  • decorators
  • properties
  • debugging
  • testing
  • profiling

Part Four

Now things get really interesting! In part four, we will be learning how to install 3rd party libraries (i.e. packages) from the Python Package Index and other locations. We will cover easy_install and pip. This section will also be a series of tutorials where you will learn how to use the packages you download. For example, you will learn how to download a file, parse XML, use an Object Relational Mapper to work with a database, etc.

Part Five

The last section of the book will cover how to share your code with your friends and the world! You will learn how to package it up and share it on the Python Package Index (i.e. how to create an egg or wheel). You will also learn how to create executables using py2exe, bb_freeze, cx_freeze and PyInstaller. Finally you will learn how to create an installer using Inno Setup.

March 11, 2015 05:15 PM


Europython

EuroPython 2015: Early-Bird tickets sold out!

We are very happy to announce that early-bird tickets are sold out!

The tickets were sold in less than a week!

We'd like to thank everyone for the fantastic feedback. Given the rush for the early-bird tickets (we sold 100 tickets in the first 4 hours), we recommend not waiting too long before getting your standard ticket. It is likely we'll sell out early again this year.

As announced, we had temporarily closed registration for a short while today and have now reopened it with the standard-rate prices.

Enjoy,

EuroPython 2015 Team

March 11, 2015 04:34 PM


CubicWeb

CubicWeb Roadmap meeting on March 5th 2015

The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in January 2015.

Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us.

Christophe de Vienne asked for discussions on:

Aurélien Campéas asked for discussions on:

Sylvain Thénault asked for discussions on:

Versions

Cubicweb

Version 3.18

This version is stable but old and maintained (current is 3.18.8).

Version 3.19

This version is stable and maintained (current is 3.19.9).

Version 3.20

This version is now stable and maintained (current is 3.20.4).

Version 3.21

See below

Agenda

The next roadmap meeting will be held at the beginning of May 2015 at Logilab. Interested parties are invited to get in touch.

Open Discussions

New integrators

Rémi Cardona (rcardona) and Denis Laxalde (dlaxalde) now have publish access to the CubicWeb repositories.

Security context

Christophe presented his proposal for a "security context" in Cubicweb, as described in https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002278.html and https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002297.html, with a proposed implementation (see https://www.cubicweb.org/ticket/4919855).

The idea has been validated, based on substitution variables whose names will start with "ctx:" (the RQL grammar will have to be modified to accept a ":").

This will then allow writing RQL queries like (API still to be tuned):

X owned_by U, U eid %(ctx:cwuser_eid)s

Pyramid

The pyramid-based web server proposed by Christophe and used for his unlish website is still under test and evaluation at Logilab. There are still missing features (implemented in cubes) that are required to deploy pyramid-cubicweb for most of the applications used at Logilab, especially cubicweb-signedrequest.

In order to make it possible to implement authentication cubes like cubicweb-signedrequest, pyramid-cubicweb requires some modifications. These have been developed and are about to be published, along with a new version of signedrequest that provides pyramid compatibility.

There are still some dependencies that lack a proper Debian package, but that should be done in the next few weeks.

In order to properly identify pyramid-related code in a cube, it has been proposed that this code should go in modules named pviews and pconfig (note that most cubes won't require any pyramid-specific code). The includeme function should however live in the cube's main package (in the __init__.py file).

There have been some discussions about the fact that, for now, a pyramid-cubicweb instance requires an anonymous user/access, which can also be a problem for some applications.

Layout

Christophe pointed out that the directory/file layout of cubicweb and cubes does not follow current Python de facto standards, which makes cubicweb hard to use in a virtualenv/pip-based installation. CWEP004 discusses some aspects of this problem.

The decision has been taken to move toward a Cubicweb ecosystem that is more pip-friendly. This will be done step by step, starting with the dependencies (packages currently living in the logilab "namespace").

Then we will investigate the feasibility of migrating the layout of Cubicweb itself.

Documentation

The new documentation structure has been approved.

It has been proposed (and more or less accepted) to extract the documentation in a dedicated project. This is not a priority, however.

Roadmap for 3.21

No change since last meeting:

  • the complete removal of the dbapi and the merging of Connection and ClientConnection remain
  • Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported: removed (too soon, pyramid-cubicweb's APIs are not stable enough)
  • Integration of CWEP-003 (FROM clause for RQL): removed (will probably never be included unless someone needs it)
  • CWEP-004 (cubes as standard python packages) is being discussed: removed (not for 3.21, see above)

Data imports and stores

A heavy refactoring is under way that concerns data import in CubicWeb. The main goal is to design a single API to be used by the various cubes that accelerate the insertion of data (dataio, massiveimport, fastimport, etc) as well as the internal CWSource and its data feeds.

For details, see the thread on the mailing-list and the patches arriving in the review pipeline.

March 11, 2015 12:30 PM


PyCon

Signup for Sponsor Tutorials!

Our Sponsor Tutorial schedule has come together and we've opened registration on Eventbrite! Running Wednesday and Thursday April 8-9, these free tutorials are offered by several of our generous sponsors. While registration for these tutorials is not required, it helps us plan for food and room size.

Check out the schedule at https://us.pycon.org/2015/schedule/sponsor-tutorials/. Each tutorial is 1.5 hours, and free!

We kick off Wednesday with David Gouldin of Heroku walking through building and deploying applications on Heroku. After lunch, Eric Feng of Dropbox introduces the Dropbox API and will take attendees through authentication to reading and writing files. There are two other open slots on the Wednesday schedule, and we'll update this post once those are known.

Thursday's schedule begins with Steve Downer and Chris Wilcox showing off how to build a Django app on the Microsoft Azure cloud. The folks at Code Climate will be talking about a number of important development topics, including how to provide quality code reviews and build an effective pull-request-based workflow. Kyle Kelly of Rackspace will be discussing cloudpipe and showing attendees how to contribute to it. Wrapping up the Thursday schedule is Google, who will be hosting a trio of yet-to-be-announced talks during their time slot.


If you're interested in our instructor-led tutorials, spaces are still open in many of them, but keep in mind that they are likely to sell out. The tutorial schedule is available here, and you can register for $150 per tutorial here.

March 11, 2015 10:50 AM

PyCon 2015 - Explore Montreal

Explore Montréal

Sponsored by Caktus Group

PyCon 2015 is in the center of an exciting city full of great food, culture, and history. It would be a shame to not check it out while you're in town!
In addition to the conference's five tracks of talks, there will be a sixth track – an opportunity for you to explore Montréal. These events are open to all PyCon attendees and those traveling with them to PyCon. 

What                                           When
Free guided tour of Old Montréal               Friday April 10th, 10:30am - 1pm
Free guided tour of Plateau Mont-Royal         Saturday April 11th, 10:30am - 1pm
Your own discovery of Montreal using Duckling  Sunday April 12th, 10:30am - 1pm

Guided Tour of Old Montréal (Friday)

The conference venue is adjacent to Old Montréal, the historical part of the city. This tour will take you through narrow cobblestone streets lined with buildings that date as far back as the 1600s. We'll pass by many souvenir shops, galleries, and restaurants, as well as take in some of Montreal's landmarks. Since we're not too far from the conference centre, grabbing lunch at one of the restaurants along Rue St-Paul before heading back is a tasty possibility! You'll be responsible for paying for your own lunch.


Guided Tour of Plateau Mont-Royal (Saturday)

One of the most well-known neighbourhoods of Montreal, the Plateau Mont-Royal is characterized by brightly coloured houses, cafés, book shops, and a laissez-faire attitude. It's the location of some famous attractions on Saint Laurent Boulevard, including Schwartz's Deli (famous for its Montreal smoked meat), and a weekend street fair during the summer that sees extremely crowded streets. In 1997, Utne Reader rated it one of the 15 "hippest" neighbourhoods in North America. Note: This tour will require 2 metro tickets ($6.00). If you can, please buy them before the tour to avoid line-ups.

Duckling Outings

Within a 10 minute walk, a world of choice awaits you. To find out where others are going and to join them, use Duckling, brought to you by Caktus Group.
Caktus made Duckling because of how much we love the impromptu outings at PyCon. We wanted to build an app that helps us focus on fun, not logistics. Duckling makes it easy for you to find and join outings during PyCon or to create your own.

Happy Exploring!

March 11, 2015 10:25 AM


Vasudev Ram

ASCII Table to PDF with xtopdf

By Vasudev Ram

Recently, I had the need for an ASCII table lookup, which I searched for and found, thanks to the folks here:

www.ascii-code.com

That gave me the idea of writing a simple program to generate an ASCII table in PDF. Here is the code for a part of that table - the first 32 (0 to 31) ASCII characters, which are the control characters:
# ASCIITableToPDF.py
# Author: Vasudev Ram - http://www.dancingbison.com
# Demo program to show how to generate an ASCII table as PDF,
# using the xtopdf toolkit for PDF creation from Python.
# Generates a PDF file with information about the
# first 32 ASCII codes, i.e. the control characters.
# Based on the ASCII Code table at http://www.ascii-code.com/

import sys
from PDFWriter import PDFWriter

# Define the header information.
column_names = ['DEC', 'OCT', 'HEX', 'BIN', 'Symbol', 'Description']
column_widths = [4, 6, 4, 10, 7, 20]

# Define the ASCII control character information.
ascii_control_characters = \
"""
0 000 00 00000000 NUL � Null char
1 001 01 00000001 SOH  Start of Heading
2 002 02 00000010 STX  Start of Text
3 003 03 00000011 ETX  End of Text
4 004 04 00000100 EOT  End of Transmission
5 005 05 00000101 ENQ  Enquiry
6 006 06 00000110 ACK  Acknowledgment
7 007 07 00000111 BEL  Bell
8 010 08 00001000 BS  Back Space
9 011 09 00001001 HT Horizontal Tab
10 012 0A 00001010 LF Line Feed
11 013 0B 00001011 VT Vertical Tab
12 014 0C 00001100 FF Form Feed
13 015 0D 00001101 CR Carriage Return
14 016 0E 00001110 SO  Shift Out / X-On
15 017 0F 00001111 SI  Shift In / X-Off
16 020 10 00010000 DLE  Data Line Escape
17 021 11 00010001 DC1  Device Control 1 (oft. XON)
18 022 12 00010010 DC2  Device Control 2
19 023 13 00010011 DC3  Device Control 3 (oft. XOFF)
20 024 14 00010100 DC4  Device Control 4
21 025 15 00010101 NAK  Negative Acknowledgement
22 026 16 00010110 SYN  Synchronous Idle
23 027 17 00010111 ETB  End of Transmit Block
24 030 18 00011000 CAN  Cancel
25 031 19 00011001 EM  End of Medium
26 032 1A 00011010 SUB  Substitute
27 033 1B 00011011 ESC  Escape
28 034 1C 00011100 FS  File Separator
29 035 1D 00011101 GS  Group Separator
30 036 1E 00011110 RS  Record Separator
31 037 1F 00011111 US  Unit Separator
"""

# Create and set some of the fields of a PDFWriter instance.
pw = PDFWriter("ASCII-Table.pdf")
pw.setFont("Courier", 12)
pw.setHeader("ASCII Control Characters - 0 to 31")
pw.setFooter("Generated by xtopdf: http://slid.es/vasudevram/xtopdf")

# Write the column headings to the output.
column_headings = [ str(val).ljust(column_widths[idx])
                    for idx, val in enumerate(column_names) ]
pw.writeLine(' '.join(column_headings))

# Split the string into lines, omitting the first and last empty lines.
for line in ascii_control_characters.split('\n')[1:-1]:

    # Split the line into space-delimited fields.
    lis = line.split()

    # Join the words of the Description back into one field,
    # since it was split due to having internal spaces.
    lis2 = lis[0:5] + [' '.join(lis[6:])]

    # Write the column data to the output.
    lis3 = [ str(val).ljust(column_widths[idx])
             for idx, val in enumerate(lis2) ]
    pw.writeLine(' '.join(lis3))

pw.close()
Discerning readers will notice the effect of some of the said control characters on the displayed program code :)

You can run the program with:
python ASCIITableToPDF.py
or
py ASCIITableToPDF.py
or
./ASCIITableToPDF.py
or
ASCIITableToPDF.py
Figuring out what needs to be done for each of the above methods of invocation to work is (as they say in computer textbooks) left as an exercise for the reader :)

(There is a bit of subtlety involved, at least for beginners, particularly if you want to do it on both Linux and Windows.)

And here is a screenshot of the PDF output of the program:


- Vasudev Ram - Online Python training and programming

Dancing Bison Enterprises

Signup to hear about new products or services from me.

Posts about Python  Posts about xtopdf

Contact Page

March 11, 2015 01:43 AM