How to get the output of a command using EiffelProcess
This is a basic but essential operation: launch a command and capture its output in a string.
There are various solutions.
- One is to redirect the output to a temporary file and then read its content. But this is not convenient, since it requires disk access and you cannot always hide the process.
- The most advanced (and still simple) solution is to use the Eiffel Process library, as shown in the sketch below.
Warning: you have to enable the multithreaded option to make full use of the EiffelProcess pipe redirection.
Warning: most of the time, you should specify an absolute path to the executable (such as "/bin/ls" instead of just "ls").
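Here is a minimal sketch of that approach. The get_output helper is only illustrative; it assumes the usual PROCESS_FACTORY / PROCESS features (process_launcher, redirect_output_to_agent, redirect_error_to_same_as_output, set_hidden, launch, wait_for_exit), whose exact signatures may differ between versions of the library:

    get_output (a_command: STRING; a_arguments: LIST [STRING]): STRING
            -- Output produced by running `a_command' with `a_arguments'.
        local
            l_factory: PROCESS_FACTORY
            l_process: PROCESS
        do
            create Result.make_empty
            create l_factory
                -- Remember to use an absolute path such as "/bin/ls".
            l_process := l_factory.process_launcher (a_command, a_arguments, Void)
                -- Append everything written to standard output (and error) to `Result'.
            l_process.redirect_output_to_agent (agent (a_text: STRING; a_buffer: STRING)
                do
                    a_buffer.append (a_text)
                end (?, Result))
            l_process.redirect_error_to_same_as_output
                -- Do not show a console window for the child process.
            l_process.set_hidden (True)
            l_process.launch
            if l_process.launched then
                l_process.wait_for_exit
            end
        end

The pipe redirection via an agent is what requires the multithreaded option mentioned above: the library reads the child's output while your code waits for the process to exit.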
Multi-threading vs. pub/sub
Thanks for the nice tutorial. Still, I think using multi-threading for the process library is a bad, bad choice and the library design is flawed.
Multi-threading and publish/subscribe just do not fit together, yet the process library combines exactly these two. Also, it should be possible to interact with a subprocess even from classical, sequential code. What do you think?
The original author of the library should certainly comment on that, but I was under the impression that if you risk a deadlock due to an incorrect reading/writing protocol, then a multithreaded solution is the safest and simplest way to ensure that you can still do something while waiting for input/output, without having to change the logic of your program much.
You are right that the application should never sit and wait for the data. But there are ways to solve this without multi-threading. The fundamental flaw is that we still develop our programs in a sequential, batch-driven style when they should be developed using publish/subscribe everywhere.
The right way to write modern, interactive applications is to use a main loop that sits on a POSIX select and reads input from many sources (GUI events, open files, network connections) and processes these inputs as they come in, never blocking on a specific input. Then a subprocess would just be another one of these input sources. That is what I tried to sketch in http://www.eiffelroom.com/blog/schoelle/what_do_applications_do_after_make
When you can keep it simple and do not need much CPU processing, I agree that an event loop is all you need. But as more cores become available to you, using threads is definitely the right way to go.
I fully agree with you again, but ...
I fully agree with you again, but (the famous "but") multi-threading is inherently non-modular. Thus, we cannot easily constrain and reason about the composition of multi-threaded programs. So, I currently see only two ways of safely working with multi-threading:
The process library, as it is, will leak out threads and that is very difficult to keep under control.
SCOOP is a promising approach to overcome the composability problems of multi-threading, but as long as SCOOP is not fully understood or implemented, we have to continue living with threads.
multiple cores do not imply multithreading
The availability of multiple cores doesn't imply the use of threads. Any concurrency facility will do to make use of the multiple cores, and while some such facilities might use threads as a lower-level implementation mechanism (in the same way as a garbage collector employs manual memory management internally), there is no need to expose threads at the user-code level. And even at the lower levels, threads are not the only implementation strategy: you could have one OS-level process per available CPU, each dispatching events and communicating using one of the available interprocess communication mechanisms.
Personally, I vote for the thread library to be removed from Eiffel Studio and replaced by something less mediocre which fits better with the Eiffel philosophy.
I would certainly agree with you on not using threads, but how do you assign work to the various cores without using threads, and in pure C?
As for a better alternative, I think this is the holy grail of concurrency, so everyone is in the race to reach that goal.
fork()
In C, to launch multiple processes you can use any of fork(), spawn(), or system().
For communication, one direction comes for free with fork() (the child inherits a copy of the parent's data). If you need a reply back, you could use files or pipes (e.g. the child's stdin/stdout), somewhat akin to the Unix "inet" daemon, where each connection is a process whose stdin/stdout are mapped back to a network connection via the inet daemon.
C has supported multicore programming this way long before there was a standard thread library (and you could say POSIX threads are not part of "pure C", depending on how you define it).
Arguably still some steps away from the holy grail :-)
I would not use fork/spawn/system as a way to do multicore programming; it is largely inefficient. Multithreaded programming is much better in that respect.
What I meant by assigning some computation is this: if you know that you have multiple cores and have an algorithm that could be parallelized, how do you do that in C without using fork/spawn/system, or threads for that matter?
Well, yes, you cannot (easily) exploit multiple cores in the purely sequential subset of any programming language, but that is beside the point.
fork() is not inefficient if you use it appropriately. If you are very much in a hurry, you can fork one process per core at startup time and pass each one work via messages, which reduces the alleged cost to effectively zero.
Quick googling shows some guy measuring fork cost at 1 to 2 milliseconds per fork, and while thread startup is faster, you just need to partition your task so that each process handles, say, 100 or more milliseconds' worth of work to have the cost of forking neutralised. There aren't that many multithreaded apps that really do hundreds of thread creations each second...
And you can also implement a non-multithreaded programming model on top of a multithreading implementation, where only the concurrency library uses threads explicitly. So there is no excuse for making threads visible at the application-code level.
More than 1 process per processor
You will normally want more than one process per processor (the exception being 100% CPU-bound algorithms) in order to drive the processor complex at close to 100%.
A first approximation to the required multi-programming level is:
MPL = number_of_processors / processor_busy_proportion_for_one_process
This is assuming your program has the whole machine to itself.
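For example (a purely illustrative calculation): with 4 processors and processes that each keep a processor busy only 25% of the time, spending the rest waiting on I/O, you would want roughly MPL = 4 / 0.25 = 16 concurrent processes to approach full utilisation.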
Colin Adams