Distributed and Parallel Systems

Thursday, March 29, 2012

Report #8

I researched how to replicate data from one server to another, and I found some interesting tools, such as CouchDB.

CouchDB is built for the web: you create and query data over HTTP (GET, PUT, POST, DELETE, ...) through a RESTful interface. It is also portable and well suited to peer-to-peer sharing. Unlike a traditional relational database, it does not represent data as rows within tables; instead, it stores data as "documents" in JSON format.
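To make the "documents over HTTP" idea concrete, here is a minimal sketch that builds the PUT request for storing one JSON document. The database name, document id, and fields are made up for illustration, and I assume CouchDB is listening on its default port 5984. The request is only built, not sent, so the sketch runs without a server.

```python
import json
import urllib.request

# Assumed values for illustration; they do not come from a real setup.
COUCH_URL = "http://localhost:5984"   # CouchDB's default port
DB_NAME = "reports"                   # hypothetical database name


def build_put_document(doc_id, doc):
    """Build (but do not send) the HTTP request that stores one JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{COUCH_URL}/{DB_NAME}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# A document is just a JSON object, not a row in a table.
request = build_put_document("report-8", {"title": "Report #8", "topic": "replication"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it to a running server.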

We can work with it from Python, as in this tutorial that explains how to set up replication:

http://www.bytexbyte.com/simple-couchdb-replication-in-python
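This is not the tutorial's code, just my own minimal sketch of the idea. CouchDB starts a replication when you POST a JSON body with "source" and "target" to the server's /_replicate endpoint. The server URL, database name, and backup host below are hypothetical. The request is only built here, so nothing is sent over the network.

```python
import json
import urllib.request


def build_replication_request(server, source, target, continuous=False):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    payload = {"source": source, "target": target}
    if continuous:
        # Ask the server to keep replicating as new changes arrive.
        payload["continuous"] = True
    return urllib.request.Request(
        url=f"{server}/_replicate",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical servers: copy the local "reports" database to a backup host.
replication = build_replication_request(
    "http://localhost:5984",
    source="reports",
    target="http://backup.example.com:5984/reports",
)
print(replication.get_method(), replication.full_url)
print(replication.data.decode("utf-8"))
```

With two real servers, urllib.request.urlopen(replication) would send the request and return the server's JSON reply.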

My objective for this week and the next is to try to build a demo.

To start, I'll write up on the wiki how to install the environment.

My nomination for this week is:

Mr. X

Thursday, March 1, 2012

Basic Concepts

Signal
Signals are an operating system feature that provides a means of notifying your program of an event and having it handled asynchronously.
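A minimal sketch of this in Python, assuming a POSIX system (SIGUSR1 does not exist on Windows). The program registers a handler and then sends the signal to itself; the handler name and the list it fills are my own choices for illustration.

```python
import os
import signal

received = []  # signal numbers seen by the handler


def handle_signal(signum, frame):
    """Called by Python when the registered signal arrives."""
    received.append(signum)


# Register the handler, then send SIGUSR1 to our own process.
signal.signal(signal.SIGUSR1, handle_signal)
os.kill(os.getpid(), signal.SIGUSR1)

print("signals handled:", received)
```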

Wait
To wait is to block until a synchronization operation, such as a lock acquisition, completes so that the task can continue.

Address
An address is the location where a file or other data is stored.

Processor
A processor is the logic circuitry that responds to and processes the basic instructions that drive a computer.

Thread
A Thread is a concurrent unit of execution. It has its own call stack for methods being invoked, their arguments and local variables.

Distributed
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal. A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs.

Quality
Quality software is reasonably bug-free, delivered on time and within budget, meets requirements and/or expectations, and is maintainable.

Synchronization
Synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, so as to reach an agreement or commit to a certain sequence of action. Data synchronization refers to the idea of keeping multiple copies of a dataset in coherence with one another, or to maintain data integrity.
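Here is a small sketch of process synchronization with Python threads. Four threads update a shared counter, a lock makes each update safe, and join() makes the main thread wait until all of them finish. The thread count and iteration count are arbitrary.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    """Increment the shared counter, taking the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:   # wait here until the lock is acquired
            counter += 1


threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()            # wait for every thread to finish

print(counter)  # 4000
```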

Port
A port is an application-specific or process-specific software construct serving as a communications endpoint in a computer's host operating system. It is associated with an IP address of the host, as well as the type of protocol used for communication.
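A minimal sketch of how a port ties together an IP address and a protocol. A TCP socket is bound to the loopback address, and asking for port 0 lets the operating system pick any free port. The choice of loopback and TCP is mine, for illustration only.

```python
import socket

# TCP (SOCK_STREAM) over IPv4 (AF_INET); port 0 means "any free port".
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()
    print(f"listening on {host}, port {port}, protocol TCP")
```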

User
A user is an agent, either a human agent (end-user) or software agent, who uses a computer or network service.

Event
A software message indicating that something has happened, such as a keystroke or mouse click.

Transfer
Transfers per second and its more common derivatives gigatransfers per second (abbreviated GT/s) and megatransfers per second (MT/s) are informal terms that refer to the number of operations transferring data that occur in each second in some given data-transfer channel.
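A transfer rate turns into bandwidth once you know how many bits move in each transfer. As a small worked sketch, assume a channel that is 64 bits wide and runs at 1600 MT/s (illustrative figures, not taken from the sources below):

```python
def bandwidth_mb_per_s(megatransfers_per_s, bus_width_bits):
    """Bandwidth in MB/s = transfers per second * bytes moved per transfer."""
    return megatransfers_per_s * bus_width_bits // 8


# 1600 MT/s * 8 bytes per transfer
print(bandwidth_mb_per_s(1600, 64))  # 12800
```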

Source:

www.doughellmann.com/PyMOTW/signal/

people.csail.mit.edu/rinard/osnotes/h6.html
searchcio-midmarket.techtarget.com/.../processor
developer.android.com/reference/.../Thread.html
http://en.wikipedia.org/wiki/Distributed_computing
http://bazman.tripod.com/what_testing.html#what_quality
http://en.wikipedia.org/wiki/Synchronization_%28computer_science%29
http://en.wikipedia.org/wiki/Port_%28computer_networking%29
en.wikipedia.org/wiki/User_(computing)

http://en.wikipedia.org/wiki/Transfer_%28computing%29

My report #5

I explained what happens in the code written in C with the MPI library, because we only had the code and nobody knew what it was doing.

This is the link:

http://elisa.dyndns-web.com/progra/MPI

I don't have a clear idea yet of what to do next week; any suggestions are welcome.

Thursday, February 23, 2012

Examples with CUDA

The first example we will look at is a vector sum using CUDA: we have two lists of numbers, add their elements pairwise, and store the results in a third list.
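Just to pin down the operation itself, here is the same pairwise sum as a plain Python sketch. It is not the CUDA code from the book, and the input values are my own. On a GPU, each element of the result would typically be computed by a separate thread instead of one loop.

```python
def add_vectors(a, b):
    """Return a new list where element i is a[i] + b[i]."""
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    return [x + y for x, y in zip(a, b)]


N = 10
a = list(range(N))             # 0, 1, 2, ...
b = [i * i for i in range(N)]  # 0, 1, 4, ...
c = add_vectors(a, b)
print(c)  # [0, 2, 6, 12, 20, 30, 42, 56, 72, 90]
```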

First, the source code for the CPU version.

As the comments show, the program is simple. We always include the header "book.h", which the examples rely on, and I took a screenshot of the results.

[Image of results]

Next, the source code for the GPU version.

Editing...

Wednesday, February 22, 2012

Running programs in parallel using Parallel Python

Parallel Python is a Python module that provides a mechanism for parallel execution of Python code on SMP (systems with multiple processors or cores) and clusters (computers connected via a network).

Parallel Python has some good features, listed below:

Parallel execution of python code on SMP and clusters.

Easy to understand and implement job-based parallelization technique (easy to convert serial application in parallel).

Automatic detection of the optimal configuration (by default the number of worker processes is set to the number of effective processors).

Dynamic processors allocation (number of worker processes can be changed at runtime).

Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead).

Dynamic load balancing (jobs are distributed between processors at runtime).

Fault-tolerance (if one of the nodes fails tasks are rescheduled on others).

Auto-discovery of computational resources.

Dynamic allocation of computational resources (consequence of auto-discovery and fault-tolerance).

SHA based authentication for network connections.

Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X).

Cross-architecture portability and interoperability (x86, x86-64, etc.).

Open source.
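To show what job-based parallelization looks like, here is a minimal sketch of a sum-of-primes job split across worker processes. Note that it uses concurrent.futures from Python's standard library as a stand-in, not Parallel Python's own API, and the function names and input limits are my own.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def is_prime(n):
    """Trial division up to the integer square root of n."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


def sum_primes(limit):
    """Sum of all primes strictly below limit; this is one job."""
    return sum(n for n in range(2, limit) if is_prime(n))


def run_jobs(limits, workers=2):
    """Run one sum_primes job per limit across a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sum_primes, limits))


if __name__ == "__main__":
    limits = [10, 100, 1000]
    for limit, total in zip(limits, run_jobs(limits)):
        print(f"sum of primes below {limit}: {total}")
```

Each call to sum_primes is independent of the others, which is what makes it easy to hand the jobs to separate workers.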

The first thing we need to do is download the Parallel Python module. The link below lists the available packages; download whichever you want.

http://www.parallelpython.com/content/view/18/32/

Decompress it, open the directory it creates, and type the following:

sudo python setup.py install

We tested an example called sum_primes.py using two netbooks with Atom processors and a MacBook Pro with an i5 processor. Before showing those results, I took some screenshots using just the MacBook Pro.

And now the results after adding the two netbooks.

As you can see, the second run takes considerably less time than the first. We also noticed the warning saying that the statistics provided above are not accurate due to job rescheduling; this may occur because they account for only a small percentage of the time.

I took screenshots of my Activity Monitor to show that the processor was running at full load.

Source:

http://www.parallelpython.com/

Thursday, February 16, 2012

My report #3

I couldn't install CUDA on my Ubuntu machine and I broke the system in the process; now I can't start the GUI, so I couldn't try out the things I learned.

What do I plan to do?

The first thing I need to do is fix my Ubuntu installation and then install CUDA again. After that, I'll try some of the step-by-step examples I was following in a good book about CUDA programming. Then I'll look for some POSIX examples.

Who is the highest contributor this week?

Well, I think it is Cecilia Urbina, because she did a good job on some programs I saw, and they helped me understand some things I was having trouble with.

Thursday, February 9, 2012

My second contribution

For my second contribution I read a book called CUDA by Example: An Introduction to General-Purpose GPU Programming. You can find it on Amazon; the link is below.

http://www.amazon.com/gp/product/0131387685/

I'll work through some examples of CUDA programming, because we need an introduction to this kind of programming environment.