[Fiware-miwi] xflow + WebCL

Philipp Slusallek Philipp.Slusallek at dfki.de
Mon Sep 16 07:59:42 CEST 2013
Hi,

Sorry, was gone over the weekend.

On 15.09.2013 10:14, Jarkko Vatjus-Anttila wrote:
> I am just thinking aloud here, and may be referring to points which
> you already noted in your email, but I still want to raise two
> questions which came up while browsing the code.
> 
> 1) In order to be fully parallelised, Xflow needs to be aware that the
> submitted work packages may be executed in some context other than the
> main thread in which JavaScript is running. Typically the computation
> would be executed in the GPU context. Hence, when executing a flow of nodes,
> the main JavaScript thread must not block, but instead do other tasks while
> waiting for the results to arrive (which in some cases might take a
> while). I call this a concurrency problem. It seems to me that the
> current implementation does not conceptually support this kind of
> execution, and hence would require rework of the Xflow core.

Processing in multiple threads in JS is one issue that we need to look
out for. I am not sure how this is implemented yet. The main issue is that we
cannot schedule work from other threads. So the only option is to use
asynchronous calls and keep doing other things in the meantime.
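
A minimal sketch of this asynchronous pattern, under stated assumptions:
FakeDevice, submit() and completeAll() are hypothetical stand-ins for a
GPU/WebCL command queue and are not part of Xflow or WebCL. In a real
setup the completion would be signalled by the browser, not by an
explicit call.

```javascript
// Hypothetical stand-in for a GPU/WebCL command queue: work is queued and
// finished later, when the "device" reports completion.
class FakeDevice {
  constructor() {
    this.pending = [];
  }
  // Queue a job and return immediately; onDone is called on completion.
  submit(kernel, input, onDone) {
    this.pending.push({ kernel, input, onDone });
  }
  // Simulates the device signalling that all queued work has finished.
  completeAll() {
    const jobs = this.pending;
    this.pending = [];
    for (const job of jobs) {
      job.onDone(job.input.map((x) => job.kernel(x)));
    }
  }
}

const log = [];
const device = new FakeDevice();

// The operator only schedules work; it never waits for the result.
device.submit((x) => x * 2, [1, 2, 3], (result) => {
  log.push("result: " + result.join(","));
});

// The main thread keeps doing other things in the meantime.
log.push("main thread: other work");

// Later, the device reports completion and the callback delivers the data.
device.completeAll();
```

The point of the sketch is the ordering: the main thread's own work is
logged before the operator result arrives, because submit() returns
without waiting.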

> 2) The other problem I am calling a parallelism problem. If we have
> primitive nodes implemented in Xflow, doing primitive operations, it is
> likely that we will build chains of these operators to gain something
> more valuable. If all of the chained primitives are computed at the GPU
> (or some other remote context for that matter), it would require that
> none of the intermediate results of the computation need to be
> transferred between the GPU and main thread. Achieving this would
> require that WebCL nodes can pass contextual information to the next
> node (i.e. shared GPU textures) to avoid unneeded data copying. This is
> also a feature which seems non-existent to me.

You are absolutely correct. As long as we have to read back the buffers
before scheduling the next Xflow operator, things cannot be as fast as
one would like and we have to operate synchronously (between CPU and GPU).

As soon as we have buffers that stay on the "GPU", it is possible to
schedule all operators at once and only synchronize at the end (either with
the rendering via a semaphore or with the rest of Xflow if the data
needs some further CPU processing).

Xflow was designed with the notion that we have a graph that we can
analyse before executing it. So this should be possible without major
changes to the Xflow core. But we definitely need to add the notion of
having different types of buffers (GPU/CPU) and do split scheduling for
both (sub-) graphs.
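
To illustrate the idea of split scheduling, here is a rough sketch that
assumes a simple linear chain of operators (a real Xflow graph is more
general). The operator names, the "device" field and planSchedule() are
hypothetical and only serve the example. Data is moved between CPU and
GPU only where two consecutive operators run on different devices.

```javascript
// Hypothetical operator chain; "device" marks where each operator runs.
const chain = [
  { name: "skinning", device: "gpu" },
  { name: "morph", device: "gpu" },
  { name: "boundingBox", device: "cpu" },
];

// Builds a schedule in which data changes location only where two
// consecutive operators run on different devices. Input data is assumed
// to start in CPU memory.
function planSchedule(operators) {
  const steps = [];
  let location = "cpu";
  for (const op of operators) {
    if (op.device !== location) {
      steps.push(op.device === "gpu" ? "upload" : "readback");
      location = op.device;
    }
    steps.push("run " + op.name + " on " + op.device);
  }
  return steps;
}

const schedule = planSchedule(chain);
```

For this chain the two GPU operators run back to back with a single
upload before them and a single readback after them, instead of one
round trip per operator.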

Also note that the "GPU" often really is on the same die as the CPU
today, such that data transfer is not nearly as expensive in terms of
time as it once was, particularly for mobile platforms. But minimizing
transfers is still important as it might require lots of power
(especially on mobile platforms).

The ultimate goal will be to merge these operators such that less data
needs to be stored in memory between them. This will only be meaningfully
possible once our "shading" compiler stuff is ready and can be applied
to this case as well -- which is my ultimate goal.
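
As a small illustration of why merging helps, the sketch below compares
running two element-wise operators separately with running them merged.
It is plain JavaScript on typed arrays, not the compiler-based merging
described above; scale, offset, runSeparately() and runMerged() are
hypothetical names chosen for the example.

```javascript
// Two hypothetical element-wise operators.
const scale = (x) => x * 2;
const offset = (x) => x + 1;

// Unmerged: each operator writes a full intermediate buffer.
function runSeparately(input, ops) {
  let buffer = input;
  let buffersAllocated = 0;
  for (const op of ops) {
    const next = new Float32Array(buffer.length);
    buffersAllocated++;
    for (let i = 0; i < buffer.length; i++) next[i] = op(buffer[i]);
    buffer = next;
  }
  return { output: buffer, buffersAllocated };
}

// Merged: the operators are composed per element, so only the final
// output buffer is allocated.
function runMerged(input, ops) {
  const output = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    let value = input[i];
    for (const op of ops) value = op(value);
    output[i] = value;
  }
  return { output, buffersAllocated: 1 };
}

const input = new Float32Array([1, 2, 3]);
const separate = runSeparately(input, [scale, offset]);
const merged = runMerged(input, [scale, offset]);
```

Both variants produce the same values, but the merged one needs one
buffer regardless of how many operators are chained, while the separate
one needs one buffer per operator.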


But until then we need to come up with a good intermediate solution that
allows us to have good demos! I consider GLSL the best option for this at
the moment as it would work in any browser and also on mobiles.

While WebCL might be the better option eventually, we could target it
through the compiler and rather focus on the short term solutions right now.

What do you think?


Best,

	Philipp

> These two problems might fundamentally be exactly the ones
> you already described, Felix, but like I said, I wanted to raise
> them explicitly. What is your general take on this?
> 
> - jarkko
> 
> 
> On Tue, Sep 10, 2013 at 11:53 AM, Felix Klein
> <lachsen at cg.uni-saarland.de <mailto:lachsen at cg.uni-saarland.de>> wrote:
> 
>     Hi Jouni,
> 
>     sorry for answering so late. I was pretty busy the recent days and
>     unfortunately won't have much time to help you with this topic at
>     the moment.
> 
>     About the integration of parallel computing into Xflow:
> 
>     *What Xflow can do at the moment:*
>     For the xml3d.js implementation, Xflow currently still runs only on
>     the CPU using JavaScript. However, we're almost done with the
>     integration into the vertex shader stage with GLSL. Any other
>     GLSL computation is not supported at the moment.
> 
>     *Isolated WebCL/GLSL operators:*
>     The first point that Kristian/Philipp mentioned is relatively easy
>     to implement, but has limited efficiency (unless you have large,
>     complex operators).
>     The idea here is, that you simply create an Xflow operator that has
>     input and output CPU buffers (e.g. TypedArrays), converts the input
>     data to the respective computation platform, performs the
>     computation, and finally converts the data back.
> 
>     There isn't really much to explain here except pointing you to a
>     custom Xflow operator.
>     You can see one
>     here: http://xml3d.github.io/xml3d-examples/examples/xflowWave/xflow-wave.xhtml
>     More precisely, here the custom operator
>     file: http://xml3d.github.io/xml3d-examples/examples/xflowWave/myxflow.js
>     And here some other operators used for simple image processing on
>     the GPU: http://xml3d.github.io/xml3d-examples/script/xflip-operators.js
> 
>     Unfortunately, we're currently missing a proper documentation on how
>     Xflow operators are implemented. I hope you can understand it
>     looking at the code.
>     If something is unclear, just contact me with specific questions.
>     If you want to integrate WebCL or GLSL rendering as described
>     previously, you simply have to do all the stuff necessary in the
>     evaluate() function of an Xflow operator.
>     This includes converting the buffers back and forth.
> 
>     *Communicating WebCL/GLSL buffers:*
>     The idea here is that Xflow supports the transfer of GLSL- or
>     WebCL-specific buffers between hardware.
> 
>     After thinking about this issue in more detail, I realized that it's
>     not quite as simple as I first thought.
> 
>     The following aspects need to be implemented to support this:
>     1. The Xflow system should implement the conversion of buffers in
>     its infrastructure, not inside the operators
>     2. Operators must request which kind of buffer they need
>     3. The buffer data structure (i.e. Xflow.BufferEntry) needs to
>     store values for each platform (WebCL, WebGL)
>     4. Computation in the Xflow graph needs to be marked for each platform
>     separately.
>     5. In order to have efficient GLSL computation connected to the
>     renderer we need to run operators and store buffers per *context
>     (e.g. per XML3D scene)*
>
>     The whole thing requires several changes in the Xflow interfaces.
>     Thus, it is not quickly explained what kind of code needs to be
>     changed at what point... 
>     I don't really think it's a huge feature, but it must be carefully
>     implemented. If you want to take over this implementation, we should
>     at least communicate changes in the interface together.
> 
>     Thus, doing this second step is a lot harder than the first.
>     Especially since I'm currently quite busy.
> 
>     I suggest we start with the first idea and then discuss the second
>     point in more detail, hopefully when I have more time.
> 
>     Bye
> 
>     Felix
> 
> 
> 
> 
>     On Mon, Sep 9, 2013 at 2:53 PM, Jouni Mietola
>     <jouni.mietola at cyberlightning.com
>     <mailto:jouni.mietola at cyberlightning.com>> wrote:
> 
>         Thank you for the comprehensive info. We have now forked the xml3d
>         repository and are going to work under the webcl branch, if you want to
>         follow us here:
>         https://github.com/Cyberlightning/xml3d.js
> 
>         Felix, if you are available and have time, it would be a great help
>         to hear more about the points (1) and (2).
> 
>         Respectfully,
>         Jouni Mietola
>         Software Engineer
>         Cyberlightning Ltd.
> 
> 
> 
>         On Thu, Sep 5, 2013 at 8:37 AM, Philipp Slusallek
>         <Philipp.Slusallek at dfki.de <mailto:Philipp.Slusallek at dfki.de>>
>         wrote:
> 
>             [I am CCing to the MiWi list, as this may be interesting
>             also in a wider
>             architectural context.]
> 
>             Hi Jouni,
> 
>             Welcome to the team! Integration of optimal HW support is a very
>             important aspect of WP13, so your looking into this is very much
>             appreciated.
> 
>             Here are the possible options that we see for implementing
>             support for
>             WebCL:
> 
>             1. You should be able to implement individual kernels
>             directly in Xflow,
>             similar to the already existing support for glsl
>             computations in Xflow
>             nodes. Thanks to the modular design of Xflow this should be
>             a rather
>             small change to Xflow and would already be a great first
>             step. Felix (in
>             CC) can point you to the right part of Xflow and get you a
>             head start
>             for implementing it.
> 
>             2. Currently Xflow assumes that the results of Xflow nodes
>             are available
>             in JS after each node. This is obviously inefficient, if the
>             next node
>             is also implemented on the GPU as data would be up and
>             downloaded after
>             each node. There is no support for this in Xflow yet but it
>             should not
>             be very hard to implement this in the Xflow core. Again
>             Felix would be
>             the right person to talk to about that.
> 
>             3. As yet another optimization one could even merge the code
>             from two
>             (or more) successive Xflow kernels into one and execute it
>             as a single
>             kernel. We already have implemented this (text-based)
>             merging of code
>             for the composition of vertex shaders. It would be an option
>             to extend
>             this scheme to also apply to WebCL code but it could be a
>             bit more
>             complicated here and programmers would have to follow a certain
>             programming style and format. It might not be necessary to
>             do this as we
>             might have Option 4 available in time before you would get
>             to this.
> 
>             4. Kristian is currently developing a very nice framework
>             for specifying
>             shaders in JS ("shader.js"). It allows one to write completely
>             generic
>             shaders in JS and uses a small compiler framework (also in
>             JS) to
>             cross-compile this code to different concrete shading
>             languages. We are
>             developing this to be able to have portable and platform
>             neutral shaders
>             that can be used with any rendering system. We are targeting
>             feed-forward renderers (glsl), deferred renderer (glsl,
>             OpenCL), and
>             real-time ray-tracing renderers (Intel's Embree, for a
>             start). The
>             language features are compatible with those from OSL (Open
>             Shading Language, Sony) and
>             MDL (Nvidia), they are compatible with (real-time) global
>             illumination
>             algorithms, and should work everywhere. So we should be able
>             to target
>             also high-end rendering with a single shader/material
>             library. We have
>             formed a small initial consortium with the German
>             industry to
>             promote this idea.
> 
>             This last option is very general and uses a "real" compiler
>             (in JS).
>             With some additional effort it should be possible to extend this
>             compiler to also support computational kernels and generate
>             WebCL (or
>             any other: CUDA, RiverTrail, C/C++ with intrinsics) code.
>             Since the
>             compiler "understands" the code, it should ideally be able
>             to fully
>             merge kernels without any formatting conventions that the
>             programmer has
>             to follow. Because we can do all sort of optimizations at
>             the high level
>             (in addition to the low-level optimizations that are still
>             done by the
>             glsl/WebCL/etc backend compiler) this will be the most
>             performant and
>             most general option.
> 
>             However, this is still work in progress and we plan to focus
>             exclusively
>             on shaders for now (upcoming paper deadline). Kristian is
>             the contact
>             person for this work.
> 
>             In terms of schedule, I suggest that you look at the options
>             (1) and (2)
>             first as low-hanging fruit that will give us most of the
>             performance
>             optimizations already. You can then still look at (3) or we
>             might skip
>             this and go straight to (4) depending on how far we are
>             ready by that time.
> 
>             Feel free to contact Felix and Kristian as needed so they
>             can point you
>             in the right direction. We have lots of use cases for any
>             speedup that
>             we can achieve and so are looking forward to your work!
> 
> 
>             Hope this helps,
> 
>                     Philipp
> 
>             On 04.09.2013 11:46, Jouni Mietola wrote:
>             > Adding our CTO, Jarkko Vatjus-Anttila as cc recipient.
>             >
>             >
>             > On Wed, Sep 4, 2013 at 11:43 AM, Jouni Mietola
>             > <jouni.mietola at cyberlightning.com
>             <mailto:jouni.mietola at cyberlightning.com>
>             > <mailto:jouni.mietola at cyberlightning.com
>             <mailto:jouni.mietola at cyberlightning.com>>> wrote:
>             >
>             >     Prof. Dr. Philipp Slusallek,
>             >
>             >
>             >     I am a software engineer from Cyberlightning Ltd. We
>             are currently
>             >     working on EU's FI-WARE project and we are trying to
>             turn the xflow
>             >     modules into accelerated ones. Do you have
>             recommendations how to
>             >     approach this issue and are you planning to use webCL
>             in near
>             >     future? I have found out that you are using rivertrail
>             for parallel
>             >     computing. We are looking forward to using webCL.
>             >
>             >     Currently we are planning to use Nokia's webCL
>             prototype for firefox
>             >     which seems to be the starting point for the webCL's
>             future
>             >     development.
>             >
>             >     About the ongoing FI-WARE project:
>             >    
>             http://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/FIWARE.Epic.AdvUI.AdvWebUI.DataflowProcessing
>             >
>             >     WebCL Working Draft:
>             >    
>             https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html
>             >
>             >     Nokia WebCL Prototype:
>             >     http://webcl.nokiaresearch.com/
>             >
>             >     Respectfully,
>             >     Jouni Mietola
>             >     Software Engineer
>             >     Cyberlightning Ltd.
>             >
>             >
>             >
>             >
>             >
>             >
>             >
>             >
> 
> 
>             --
> 
>             -------------------------------------------------------------------------
>             Deutsches Forschungszentrum für Künstliche Intelligenz
>             (DFKI) GmbH
>             Trippstadter Strasse 122, D-67663 Kaiserslautern
> 
>             Geschäftsführung:
>               Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>               Dr. Walter Olthoff
>             Vorsitzender des Aufsichtsrats:
>               Prof. Dr. h.c. Hans A. Aukes
> 
>             Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
>             USt-Id.Nr.: DE 148646973, Steuernummer:  19/673/0060/3
>             ---------------------------------------------------------------------------
> 
> 
> 
> 
> 
> 
> -- 
> Jarkko Vatjus-Anttila
> VP, Technology
> Cyberlightning Ltd.
> 
> mobile. +358 405245142
> email. jarkko at cyberlightning.com <mailto:jarkko at cyberlightning.com>
> 
> Enrich Your Presentations! New CyberSlide 2.0 released on February 27th.
> Get your free evaluation version and buy it now! www.cybersli.de
> <http://www.cybersli.de/>
> 
> www.cyberlightning.com <http://www.cyberlightning.com/>


-- 

-------------------------------------------------------------------------
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
  Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
  Prof. Dr. h.c. Hans A. Aukes

Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973, Steuernummer:  19/673/0060/3
---------------------------------------------------------------------------