I'd like to learn more about Education & KICKSTART everything coded for parallelism in my data science research models.

  • Question

  • I'm trying to become an expert in efficient processing for modeling, simulation, and research on big data, as (I guess) a data scientist and quantitative analyst. I'm looking for new ways to improve my small local computer's throughput, but eventually I'd like to scale up to a big, massively parallel pooled stack of computers, where each stack will need to process data efficiently. Some would just have to be flash RAM processors, with a dedicated pool for actually writing the data to disk (which is fairly slow).

    I just want to begin my education and train myself to code this way from now on, thinking in terms of not just running some program on my computer or local server, but literally distributing a scalable data stack, like a rack full of ASICs, or GPU parallelism inside the stack of pools of data.

    My focus is on going from discrete stochastic analysis to a more continuous-time analysis of data as it's fed in by data service providers.

    I'd basically call myself a student at this point, especially in the realm of MPPs.

    I've been learning through online resources, trying to get GPUs to work in whatever local machine I'm working on, and trying to adopt a new way of coding things.

    ANY SUGGESTIONS on how to begin my journey? Where do I go to learn what I want to learn?

    I'm also on a computer right now where I'm trying to exploit its GPU, but it's having a problem recognizing the hardware.

    I could use any suggestions. I use OS X, Windows 10, and a very clean distro of Linux across about 3 personal computers. This iMac tends to be the fastest one, so I'm trying to get the most out of its GPU and recode all my projects to exploit multithreading. Right now I'm mainly using scripting languages like Matlab and Python, but I'd like to eventually delve into lower-level programming tools like C++. I'm not there yet. I'd like to be able to move between these one day, because I like the rapid prototyping ability of Python: it's perfect for building models on the fly, and you're not inhibited by all the C++ problems. That said, I'm not a C++ guy yet. I'm mainly in Python, R, and Matlab.

    Pic below: I've got an ATI Radeon HD 4850 card and I'm trying to enable GPU acceleration for Python, Matlab, and C++ data analysis work. How do I get past this?

    • I'm using Python 3.6, mainly with the Accelerate library and CUDA drivers.
    • I'm trying to use the multithreading ability of parallel GPU processing.
    • I'm not sure how to set this up right. SUGGESTIONS?

    Wednesday, March 8, 2017 12:21 AM

All replies

  • Typically... I need an education. :)
    Wednesday, March 8, 2017 12:22 AM