POSH Python Object Sharing

  • Published on
    25-Feb-2016

  • View
    53

  • Download
    1

DESCRIPTION

POSH Python Object Sharing. Steffen Viken Valvg In collaboration with Kjetil Jacobsen & ge Kvalnes. University of Troms, Norway Sponsored by Fast Search & Transfer. Test.py. for x in "TEST PROGRAM": if x not in "FORGET IT": print x,. Test.pyc. 0 SETUP_LOOP - PowerPoint PPT Presentation

Transcript

  • POSHPython Object SharingSteffen Viken ValvgIn collaboration withKjetil Jacobsen &ge KvalnesUniversity of Troms, NorwaySponsored byFast Search & Transfer

  • Python Execution Model

  • Python Threading ModelThread ABytecodesThread BEach thread executes a separate sequence of byte codesAll threads must contend for one global interpreter lock

  • Example: Matrix MultiplicationPerforms a matrix multiplication A = B x CThe work is split between several worker threadsThe application runs on a machine with 8 CPUsThreads do not scale for multiple CPUs due to lock contention on the GIL

  • Workaround: ProcessesProcess AProcess BIPCEach process has its own interpreter lockRequires inter-process communication, using e.g. message passing by means of pipes

  • Matrix Multiplication using ProcessesA master process distributes the input matrices to a set of worker processesEach worker process computes some part of the output matrix, and returns its result to the masterThe master process assembles the final result matrixMore communication, and more complex pattern than using threads

  • Ways AheadCommunication through standard Python container objects favors threadsScalability on multiprocessor architectures favors processesThe GIL is here to stay, so making threads scale better is hardHowever, there might be room for improvement of inter-process communication mechanisms

  • Using Shared Memory for IPCProcess BShared Memory

    Process AProcesses communicate by accessing a shared memory regionRequires explicit synchronization and data marshalling, imposes a flat data structure

  • Using POSH for IPCProcess BShared Memory

    Process AAllocates regular Python objects in shared memoryShared objects are accessed transparently through regular method callsIPC is done by modifying shared, mutable objects

  • ComplicationsProcesses must synchronize their access to shared, mutable objects (just like threads)Explicit synchronization of critical regions must be possible, while implicit synchronization upon accessing shared objects is desireablePythons regular garbage collection algorithm is inadequate for shared objects, which may be referenced by multiple processes

  • Proxy ObjectsShared ObjectProxy ObjectXProvides transparent access to a shared object by forwarding all attribute accesses and method callsProvides a single entry point to a shared object, where synchronization policies may be enforced X.method1()return value

  • Multi-Process Garbage CollectionMust account for references from all live processesMust stay up-to-date when processes fork, as this may create new references to shared objectsShould be able to handle abnormal process termination without leaking shared objects

  • Garbage Collection in POSHProcess A

    Shared Memory

    Process B

    XYLShared ObjectMRegular Python referenceReference from a process to a shared object (type I)Reference from one shared object to another (type II)

  • Garbage Collection DetailsPOSH creates at most one proxy object per process for any given shared objectShared objects are always referenced through their proxy objectsA bitmap in each shared object records the processes that have a corresponding proxy object. This tracks references of type I (from a process) A separate count in each shared object records the number of references to the object from other shared objects. This tracks references of type IIShared objects are deleted when there are no references to them of either type

  • PerformancePerforming a matrix multiplication A = B x C using POSHThe work is split between several worker processesThe application runs on a machine with 8 CPUsMore overhead, but scales for multiple CPUs

  • SummaryPython uses a global interpreter lock (GIL) to serialize execution of byte codesThis entails a lack of scalability on multiprocessor architectures for CPU-intensive multi-threaded appsHowever, threads offer an attractive programming model, with implicit communicationProcesses + shared memory reduce IPC overheads, but normally impose flat data structures and require data marshallingPOSH uses processes + shared memory to offer a programming model similar to threads, with the scalability of processes

  • AvailabilityOpen source, hosted at SourceForgehttp://poshmodule.sf.net/Still not very stableDevelopers wanted

  • Example Usageimport posh

    class Stuff(object): pass

    posh.allow_sharing(Stuff, posh.generic_init)

    mystuff = posh.share(Stuff())

    def worker1(): mystuff.money = 0def worker2(): mystuff.debt = 100000def worker3(): mystuff.balance = mystuff.money - mystuff.debt

    for w in worker1, worker2, worker3: posh.forkcall(w)posh.waitall()

Recommended

View more >