Common Misconceptions about Enterprise Python

Python is currently one of the most popular dynamic programming languages, along with Perl, Tcl, PHP, and newcomer Ruby. Although it is often viewed as a “scripting” language, it is really a general-purpose programming language along the lines of Lisp or Smalltalk (as are the others, by the way). Today, Python is used for everything from throw-away scripts to large, scalable web servers that provide uninterrupted service 24×7. It is used for GUI and database programming, client- and server-side web programming, and application testing. It is used by scientists writing applications for the world’s fastest supercomputers and by children first learning to program.

Today, Python powers over 50 projects, including:

  • Features and products, such as eBay Now and RedLaser
  • Operations and infrastructure, both OpenStack and proprietary
  • Mid-tier services and applications, like the one used to set PayPal’s prices and check customer feature eligibility
  • Monitoring agents and interfaces, used for a number of deployment and security use cases
  • Batch jobs for data import, price adjustment, and more
  • And far too many developer tools to count

Python is a new language

What with all the start-ups using it and children learning it these days, it’s easy to see how this misconception persists. Python is actually over 23 years old, originally released in 1991, 5 years before HTTP 1.0 and 4 years before Java. A now-famous early use of Python came in 1996: Google’s first successful web crawler.

If you’re curious about the long history of Python, Guido van Rossum, Python’s creator, has taken the care to tell the whole story.

Python is not compiled

While not requiring a separate compiler toolchain like C++, Python is in fact compiled to bytecode, much like Java and many other compiled languages. Further compilation steps, if any, are at the discretion of the runtime, be it CPython, PyPy, Jython/JVM, IronPython/CLR, or some other process virtual machine.
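
The bytecode step is easy to observe directly. The following is a minimal sketch using the standard `dis` module on CPython; the function `add` is a made-up example, and the exact opcode names differ between CPython versions, so no particular listing should be expected.

```python
import dis


def add(a, b):
    """Hypothetical example function; any function would do."""
    return a + b


# By the time the function object exists, its body is already bytecode.
bytecode = add.__code__.co_code  # raw bytecode, a bytes object
opnames = [ins.opname for ins in dis.get_instructions(add)]

# Source text can also be compiled explicitly, without executing it.
code_obj = compile("total = 1 + 2", "<example>", "exec")

if __name__ == "__main__":
    dis.dis(add)                    # human-readable disassembly
    print(type(code_obj).__name__)  # the compiled unit is a code object
```

Other runtimes, such as Jython or IronPython, compile to their own virtual machine's format instead, so this inspection is specific to CPython-style bytecode.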

The general principle at PayPal and elsewhere is that the compilation status of code should not be relied on for security. It is much more important to secure the runtime environment, as virtually every language has a decompiler, or can be intercepted to dump protected state. See the next misconception for more of Python’s security implications.

Python is not secure

Python’s affinity for the lightweight may not make it seem formidable, but the intuition here can be misleading. One core tenet of security is to present as small a target as possible. Big systems are anti-secure, as they tend to overly centralize behaviors and undercut developer comprehension. Python keeps these demons at bay by encouraging simplicity. Moreover, CPython addresses these issues by being a simple, stable, and easily auditable virtual machine. In fact, a recent analysis by Coverity Software resulted in CPython receiving their highest quality rating.

Python also features an extensive array of open-source, industry-standard security libraries. At PayPal, where we take security and trust extremely seriously, we find that a combination of hashlib, PyCrypto, and OpenSSL, via PyOpenSSL and our own custom bindings, covers all of PayPal’s diverse security and performance requirements.
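
As a small illustration of the standard-library end of that toolbox, here is a sketch using `hashlib` together with the standard `hmac` module. The key, the message, and the `verify` helper are placeholders invented for this example and do not reflect anything in PayPal's systems.

```python
import hashlib
import hmac

# Placeholder values for illustration only; never hard-code real keys.
secret_key = b"example-key"
message = b"amount=10.00&currency=USD"

# A SHA-256 digest of the message, as a hex string.
digest = hashlib.sha256(message).hexdigest()

# A keyed HMAC-SHA256 tag that a receiver holding the same key can verify.
tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(key, msg, expected_tag):
    """Recompute the tag and compare it in constant time."""
    actual = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(actual, expected_tag)
```

Using `hmac.compare_digest` rather than `==` avoids leaking information through comparison timing.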

For these reasons and more, Python has seen some of its fastest adoption at PayPal (and eBay) within the application security group. Here are just a few security-based applications using Python for PayPal’s security-first environment:

  • Creating security agents for facilitating key rotation and consolidating cryptographic applications.
  • Integrating with industry-leading HSM technologies.
  • Constructing TLS-secured wrapper proxies for less-compliant stacks.
  • Generating keys and certificates for our internal mutual-authentication schemes.
  • Developing active vulnerability scanners.

Plus, there are myriad Python-built, operations-oriented systems with security implications, such as firewall and connection management. In the future we’ll certainly aim to put together a deep dive on PayPal Python security details.

Python is a scripting language

Python can indeed be used for scripting, and is one of the forerunners of the domain due to its simple syntax, cross-platform support, and ubiquity among Linux, Macs, and other Unix machines.

In fact, Python may be one of the most flexible technologies among general-use programming languages. To list just a few:

  • Telephony infrastructure (Twilio)
  • Payments systems (PayPal, Balanced Payments)
  • Neuroscience and psychology (many, many examples)
  • Numerical analysis and engineering (numpy, numba, and many more)
  • Animation (LucasArts, Disney, Dreamworks)
  • Gaming backends (EVE Online, Second Life, Battlefield, and so many others)
  • Email infrastructure (Mailman, Mailgun)
  • Media storage and processing (YouTube, Instagram, Dropbox)
  • Operations and systems management (Rackspace, OpenStack)
  • Natural language processing (NLTK)
  • Machine learning and computer vision (scikit-learn, Orange, SimpleCV)
  • Security and penetration testing (so many, including eBay/PayPal)
  • Big Data (Disco, Hadoop support)
  • Calendaring (Calendar Server, which powers Apple iCal)
  • Search systems (ITA, Ultraseek, and Google)
  • Internet infrastructure (DNS) (BIND 10)

Not to mention web sites and web services aplenty. In fact, PayPal engineers seem to have a penchant for going on to start Python-based web properties, YouTube and Yelp, for example. For an even bigger list of Python success stories, check out the official list.

Python is weakly-typed

Python’s type system is characterized by strong, dynamic typing. Wikipedia can explain more. Not that it is a competition, but as a fun fact, Python is more strongly typed than Java. Java has a split type system for primitives and objects, with null lying in a sort of gray area. Modern Python, on the other hand, has a unified strong type system, where the type of None is well specified. In addition, the JVM itself is also dynamically typed, as it traces its roots back to an implementation of a Smalltalk VM acquired by Sun.
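
A few lines are enough to show both halves of "strong, dynamic typing". This is a minimal sketch; the variable names are arbitrary and chosen only for the example.

```python
# Strong typing: unrelated types are not silently coerced.
try:
    result = "1" + 1
except TypeError as exc:
    result = type(exc).__name__

# Dynamic typing: a name can be rebound to objects of different types,
# but every object always has exactly one well-defined type.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

# None is an ordinary object with its own type, not a typeless null.
none_type = type(None).__name__

if __name__ == "__main__":
    print(result, first_type, second_type, none_type)
```

Here the string-plus-integer expression raises `TypeError` rather than guessing at a conversion, and `None` reports its type as `NoneType`.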

Python’s type system is very nice, but for enterprise use there are much bigger concerns at hand.

Python is slow

First, a critical distinction: Python is a programming language, not a runtime. There are several Python implementations:

  • CPython is the reference implementation, and also the most widely distributed and used.
  • Jython is a mature implementation of Python for use with the JVM.
  • IronPython is Microsoft’s Python for the Common Language Runtime, aka .NET.
  • PyPy is an up-and-coming implementation of Python, with sophisticated features such as JIT compilation, incremental garbage collection, and more.

Each runtime has its own performance characteristics, and none of them are slow per se. The more important point here is that it is a mistake to assign performance assessments to a programming language. Always assess an application runtime, preferably against a particular use case.
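
In practice, assessing a runtime against a use case means timing the code you care about on the interpreter you actually deploy. Below is a minimal sketch using the standard `timeit` module; the two string-building functions are hypothetical stand-ins for a real workload, and the numbers will differ between CPython, PyPy, and other runtimes.

```python
import timeit


def join_with_loop(words):
    """Build a string by repeated concatenation."""
    out = ""
    for word in words:
        out += word
    return out


def join_with_str_join(words):
    """Build the same string with str.join."""
    return "".join(words)


words = ["spam"] * 1000

# Time each candidate on the runtime you actually deploy.
loop_seconds = timeit.timeit(lambda: join_with_loop(words), number=200)
join_seconds = timeit.timeit(lambda: join_with_str_join(words), number=200)

if __name__ == "__main__":
    print(f"loop: {loop_seconds:.4f}s  join: {join_seconds:.4f}s")
```

Running the same script under a different interpreter is often the quickest way to find out whether a performance concern belongs to the language or to one runtime.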

Having cleared that up, here is a small selection of cases where Python has offered significant performance advantages:

  1. Using NumPy as an interface to Intel’s MKL SIMD.
  2. PyPy’s JIT compilation attains faster-than-C performance.
  3. Disqus scales from 250 to 500 million users on the same 100 boxes.

Admittedly these are not the newest examples, just my favorites. It would be easy to get side-tracked into the wide world of high-performance Python and the unique offerings of each runtime. Instead of addressing individual special cases, attention should be drawn to the generalizable impact of developer productivity on end-product performance, particularly in an enterprise setting.

Given enough time, a disciplined developer can execute the only proven approach to achieving accurate and performant software:

  1. Engineer for correct behavior, including the development of respective tests.
  2. Profile and measure performance, identifying bottlenecks.
  3. Optimize, paying proper respect to the test suite and Amdahl’s Law, and taking advantage of Python’s strong roots in C.
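
The three steps above can be sketched with nothing but the standard library. This is an illustrative outline, not PayPal's actual workflow: `slow_sum_of_squares`, `report`, and `amdahl_speedup` are hypothetical names, and the workload is deliberately trivial.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Hypothetical hot spot: deliberately naive."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def report(n):
    """Hypothetical entry point that calls the hot spot."""
    return {"n": n, "sum_of_squares": slow_sum_of_squares(n)}


# Step 1: pin down correct behavior with a test before optimizing.
assert report(4)["sum_of_squares"] == 0 + 1 + 4 + 9

# Step 2: profile to find where the time actually goes.
profiler = cProfile.Profile()
profiler.enable()
report(100_000)
profiler.disable()
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
profile_text = buffer.getvalue()


# Step 3: estimate the payoff with Amdahl's Law before optimizing.
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of runtime gets `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)
```

For instance, `amdahl_speedup(0.5, 2.0)` is about 1.33: doubling the speed of code that accounts for half the runtime makes the whole program only a third faster, which is why profiling comes before optimizing.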

It might sound simple, but even for seasoned engineers, this can be a very time-consuming process. Python was designed from the ground up with developer timelines in mind. In our experience, it’s not uncommon for Python projects to go through three or more iterations in the time it takes C++ and Java projects to do just one. Today, PayPal and eBay have seen multiple success stories wherein Python projects outperformed their C++ and Java counterparts, with less code (see right), all thanks to fast development times enabling careful tailoring and optimization. You know, the fun stuff.

Python does not scale

Scale has many definitions, but by any definition, YouTube is a web site at scale. More than 1 billion unique visitors each month, over 100 hours of uploaded video per minute, and going on 20 percent of peak Internet bandwidth, all with Python as a core technology. Dropbox, Disqus, Eventbrite, Reddit, Twilio, Instagram, Yelp, EVE Online, Second Life, and, yes, eBay and PayPal all have Python scaling stories that show scale is more than just possible: it’s a pattern.

The key to success is simplicity and consistency. CPython, the primary Python virtual machine, maximizes these characteristics, which in turn makes for a very predictable runtime. One would be hard pressed to find Python programmers concerned about garbage collection pauses or application startup time. With strong platform and networking support, Python naturally lends itself to smart horizontal scalability, as manifested in systems like BitTorrent.

Furthermore, scaling is all about measurement and iteration. Python is built with profiling and optimization in mind.

Python lacks good concurrency support

Occasionally, while debunking performance and scaling misconceptions, someone tries to get technical: “Python lacks concurrency,” or, “What about the GIL?” If dozens of counterexamples are insufficient to bolster one’s confidence in Python’s ability to scale vertically and horizontally, then an extended explanation of a CPython implementation detail probably won’t help, so I’ll keep it brief.

Python has great concurrency primitives, including generators, greenlets, Deferreds, and futures. Python has great concurrency frameworks, including eventlet, gevent, and Twisted. Python has had some amazing work put into customizing runtimes for concurrency, including Stackless and PyPy. All of these and more show that there is no shortage of engineers effectively and unapologetically using Python for concurrent programming. Also, all of these are officially supported and/or used in enterprise-level production environments.
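
Two of those primitives, generators and futures, ship with the standard library and can be shown in a few lines. This is a minimal sketch: `countdown` and `fetch_length` are hypothetical stand-ins, and greenlets, Deferreds, eventlet, gevent, and Twisted are third-party packages not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def countdown(n):
    """A generator: execution pauses at each yield and resumes on demand."""
    while n > 0:
        yield n
        n -= 1


def fetch_length(text):
    """Stand-in for an I/O-bound task such as a network request."""
    return len(text)


# Generators give cooperative, single-threaded interleaving.
ticks = list(countdown(3))

# Futures represent results that are computed concurrently.
payloads = ["alpha", "beta", "gamma"]
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(fetch_length, p) for p in payloads]
    lengths = [f.result() for f in futures]
```

Collecting results in submission order, as above, keeps the output deterministic even though the tasks may finish in any order.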

The Global Interpreter Lock, or GIL, is a performance optimization for most use cases of Python, and a development-ease optimization for virtually all CPython code. The GIL makes it much easier to use OS threads or green threads (usually greenlets), and does not affect the use of multiple processes. For more information, see this great Q&A on the topic and this overview from the Python docs.

Here at PayPal, a typical service deployment entails multiple machines, with multiple processes, multiple threads, and a very large number of greenlets, amounting to a very robust and scalable concurrent environment (see figure below). In most enterprise environments, parties tend to prefer a fairly high degree of overprovisioning, for general prudence and disaster recovery. Nevertheless, in some cases Python services still see millions of requests per machine per day, handled with ease.

Python programmers are scarce

There are not as many Python web developers as PHP or Java web developers. This is probably mostly due to a combined interaction of industry demand and education, though trends in education suggest that this may change.

That said, Python developers are far from scarce. There are millions worldwide, as evidenced by the dozens of Python conferences, tens of thousands of StackOverflow questions, and companies like YouTube, Bank of America, and LucasArts/Dreamworks employing Python developers by the hundreds and thousands. At eBay and PayPal we have hundreds of developers who use Python on a regular basis, so what’s the trick?

Well, why scavenge when one can create? Python is exceptionally easy to learn, and is a first programming language for children, university students, and professionals alike. At eBay, it only takes one week for a new Python programmer to show real results, and they often really start to shine in as little as 2-3 months, all made possible by the Internet’s rich cache of interactive tutorials, books, documentation, and open-source codebases. Another important factor to consider is that projects using Python simply do not require as many developers as other projects.

Python is not for big projects

While Instagram reached many millions of hits a day at the time of their billion-dollar acquisition, the whole company was still only a group of a dozen or so people. Dropbox in 2011 had only 70 engineers, and other teams were similarly lean. So, can Python scale to big teams?

Bank of America actually has over 5,000 Python developers, with over 10 million lines of Python in one project alone. JP Morgan underwent a similar transformation. YouTube also has engineers in the thousands and lines of code in the millions. Big products and big teams use Python every day, and while it has excellent modularity and packaging characteristics, beyond a certain point much of the general development scaling advice stays the same. Tooling, strong conventions, and code review are what make big projects a manageable reality.

Luckily, Python starts with a good baseline on those fronts as well. We use PyFlakes and other tools to perform static analysis of Python code before it gets checked in, as well as adhering to PEP8, Python’s language-wide base style guide.
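
To give a feel for what static analysis means here, the sketch below uses the standard `ast` module to flag imports that are never referenced, one of the checks that tools of this kind perform. It is a simplified illustration and not how PyFlakes is implemented; the `unused_imports` function and the `sample` source are hypothetical, and cases such as star imports or names used only inside strings are not handled.

```python
import ast


def unused_imports(source):
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)  # parses only; nothing is executed
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


sample = "import os\nimport sys\nprint(sys.argv)\n"

if __name__ == "__main__":
    print(unused_imports(sample))
```

Because the source is only parsed and never run, a check like this is safe to apply to untrusted or half-finished code before it is checked in.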

Finally, it should be noted that, in addition to the scheduling speedups, projects using Python generally require fewer developers as well. Our most common success story starts with a Java or C++ project slated to take a team of 3-5 developers somewhere between 2 and 6 months, and ends with a single motivated developer completing the project in 2-6 weeks (or hours, for that matter).

A miracle for some, but a fact of modern development, and often a necessity for a competitive business.

Summary

Folklore can be a fun pastime. Discussions around these misconceptions remain some of the most active and educational, both internally and externally. Also, remember that the appearance of these seemingly tedious and troublesome concerns is a sign of steadily growing interest, and with a steady influx of interested parties comes the constant job of education. Here’s hoping that this post manages to extinguish a flame war and enable a project or two to talk about the real work that can be achieved with Python.

Source: PayPal Engineering