Bulletin N˚2
October 2011
Editorial
• Let’s do it!
Special:
• Grid for bioinformatics: the challenge of the region
Reaching the first year
• Itacuruçá: meeting point
• Agreements
• New User
• New version of OurGrid
• Dissemination
Interview
• Distributed computing as a service in Latin America
Two countries steadily in the e-Science
• A priority for Argentina
• Colombian Authority
http://www.gisela-grid.eu/
@gisela_grid
General Coordination
Herbert Hoeger
Journalistic Work
Ysabel Briceño
Design and Layout
María Eugenia Hernández
Translation
Alicia Bohórquez
Editorial
Let’s do it!
Francisco Brasileiro
WP6 Manager
GISELA’s grid infrastructure has just completed its first year and has matured into a substantial source of computing resources. More than
1,700 cores are available to GISELA’s users right now. Yet
we seem to be facing a problem that was also experienced in the
EELA-2 days: not enough people are taking advantage of the
infrastructure. As a representative of the infrastructure providers, I
urge users from all Virtual Research Communities to do it. After all,
as Porter’s lyrics say, “even lazy jellyfish do it ...” so, “... let’s fall in
love!”
Of course, courting requires effort from both sides. We, the providers, have taken some concrete steps to attract
users to the infrastructure. In particular, the main objective of Work
Package 6 (Infrastructure and Application-oriented Services for
User Communities) is to make it easier for users to get their applications running in the infrastructure. This has been attempted in
several ways.
Firstly, we have adapted most of the services inherited
from the EELA-2 project so that they can be used in the GISELA
infrastructure with the latest versions of the grid middleware that
have been deployed. These include services that ease the process
of porting an application to run in the grid, and services that
facilitate the incorporation of resources into the infrastructure.
Secondly, we have tried to address one of the main complaints of
users: the difficulty of adapting their applications to the execution environment that they find in the grid. The DIRAC middleware
aims at mitigating this problem. A few users are already benefiting
from the facilities offered by DIRAC, and the DIRAC support team is
eager to help many more.
Finally, we have also addressed another point of concern: the
management of large batches of jobs, commonly called bag-of-tasks applications. This is a common class of applications. Both
DIRAC and OurGrid have facilities that can be used to track the
execution of large batches of jobs, making sure that jobs in the batch
that fail are resubmitted automatically and transparently. Another
feature implemented by OurGrid is the tracking of slow jobs
that delay the completion of the batch. This is a common problem in
grids, due to the inherent heterogeneity of the resources that comprise them. We have also developed a script that can be used to
run bag-of-tasks jobs using the OurGrid middleware over gLite resources. This is particularly useful for applications whose batches are composed of long-running jobs (dozens of hours). Such jobs
are not suitable for the opportunistic environment in
which the OurGrid middleware is typically used.
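The two facilities described above, automatic resubmission of failed jobs and detection of slow jobs, can be illustrated with a minimal sketch. It is not the DIRAC or OurGrid implementation: the function names, the retry limit and the "twice the median runtime" rule for spotting slow jobs are all assumptions chosen for illustration, and `flaky_worker` is a hypothetical stand-in for a real grid job.

```python
import statistics
from collections import deque


def run_bag_of_tasks(tasks, run_once, max_attempts=3):
    """Run independent tasks, resubmitting each failed task automatically."""
    queue = deque((task, 1) for task in tasks)
    results, abandoned = {}, []
    while queue:
        task, attempt = queue.popleft()
        try:
            results[task] = run_once(task, attempt)
        except Exception:
            if attempt < max_attempts:
                queue.append((task, attempt + 1))  # transparent resubmission
            else:
                abandoned.append(task)  # give up after max_attempts tries
    return results, abandoned


def find_stragglers(finished_runtimes, running_elapsed, factor=2.0):
    """Return running jobs that are much slower than the finished ones.

    Assumed rule: a job is slow once its elapsed time exceeds
    `factor` times the median runtime of the jobs already finished.
    """
    if not finished_runtimes:
        return []
    threshold = factor * statistics.median(finished_runtimes)
    return sorted(job for job, elapsed in running_elapsed.items()
                  if elapsed > threshold)


def flaky_worker(task, attempt):
    """Hypothetical worker: task "t2" fails on its first attempt only."""
    if task == "t2" and attempt == 1:
        raise RuntimeError("worker lost")
    return task + "-ok"


if __name__ == "__main__":
    print(run_bag_of_tasks(["t1", "t2", "t3"], flaky_worker))
    print(find_stragglers([10, 12, 11], {"a": 15, "b": 40}))
```

In a scheduler that replicates slow jobs, the jobs returned by `find_stragglers` would be the candidates to launch again on another resource, keeping whichever copy finishes first.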
In summary, a great deal of work has gone into developing services that can help users have a much more productive
experience with a grid infrastructure. There is also a support team
that is looking forward to helping users benefit from the GISELA
infrastructure. So, come in large numbers; we will be delighted to
help all of you!
Grid for bioinformatics:
the great challenge of the
region
Sequencing and other advanced activities in biology, such as gene prediction, genome assembly and
simulation of evolutionary processes, now require computing to manage and analyze the
billions of data items that this branch of science produces
One area of research that tends to produce increasing amounts of data is related to the biological sciences. After unveiling the secrets of
the double helix of DNA, the great challenge of
elucidating the human genome has led to a high
standard of information management. Thousands
of bases are present even in the simplest organism, so the reading of DNA has also required
technological efforts to achieve the speed and accuracy needed in such experiments.
This set of biochemical techniques and methods
has effectively enjoyed the benefits of computing
to achieve solutions that, for their time and cost, are
called high performance. Sequencing and
other advanced activities in biology, such as gene
prediction, genome assembly and simulation of
evolutionary processes, now require computing
to manage and analyze the billions of
data items that this branch of science produces, in a field of work
called bioinformatics.
The research projects involving these activities
require computing and storage power that
is often not available in laboratories in Latin
America, so this area is a natural candidate
for the Grid infrastructure.
During the Grid efforts in Latin America
(with EELA, EELA2 and more recently GISELA),
bioinformatics researchers and technical teams have gained experience that helped
develop or adapt applications, at different levels, to the available
infrastructure. The most successful developments originated in
laboratories in Brazil, Mexico, Colombia
and Spain, as European partner.
As an inheritance, GISELA has tens of bioinformatics applications
that have been run on the Grid platform, covering heuristic algorithms, numerical linear algebra, molecular biology and genomics. They were generated
during a testing and learning process
that involved detecting needs, developing or adapting applications,
and training technicians and researchers,
to finally achieve a high performance computing service in this area.
• The experience of followers waiting
DistBLAST (Distributed BLAST):
Many of the computational methods used in
genomics rely on BLAST, a basic tool for comparing biological sequences. BLAST is a very efficient tool, but it works with large databases and large sets of search
entries. Since bioinformatics laboratories in the
midwest region of Brazil needed to run BLAST
on many biological sequences several times
a day, DistBLAST was proposed during the
EELA project to accelerate this process for the
researchers. The hope was that DistBLAST
would return results in a shorter time,
improving the effective performance of BLAST. The
University of Brasilia (UnB) was the institution responsible for initiating the use of this application.
Maria Emilia T. Walter, bioinformatics researcher from the group
at UnB, said: “The main advantage of the Grid platform is to
treat huge volumes of data, using broadly used bioinformatics
computational tools.”
Walter explains that the group intends to continue the gridification of the tool on this platform with other middleware. “DistBLAST was
developed with OurGrid and we want to do it now with gLite.
These tools are very useful for research projects in genomics,
so we want to make another gridification so that this region of
Brazil can use them.”
DistBlast
Maria Emilia Walter (UnB): “We want to
continue the gridification of this tool.”
BioNMF-Grid:
BioNMF is a Web tool that applies the Non-negative Matrix Factorization methodology in different analysis contexts to support some of the most important applications in biology.
Alberto Pascual, a member of the Bioinformatics Group at the
Universidad Complutense de Madrid (UCM), explains that this Web application is free to
use and anonymous (all jobs are launched using the same grid user) and runs an
algorithm with large computing power needs. The application was gridified for the
EELA-2 project and was developed to increase the productivity of the
“local cluster” version at the UCM. “Before the gridification process, we tried to install a
private grid in our laboratory for testing, but given how little gLite 3.1
documentation existed at the time (especially in terms of installation and configuration),
it was decided to directly use the EELA-2 production infrastructure through the
‘public’ User Interface (UI) from CIEMAT (Spain) ... The
implementation process was not overly complicated,
since we had attended gLite 3 management courses at
the user level. One of these courses was online, using
the GILDA infrastructure.”
This application is currently active in the local cluster version of the UCM, but with limited work load.
According to Pascual, “Grid technology has much to
contribute to applications such as BioNMF”, so
the aspiration is to have the appropriate Grid infrastructure,
as several scientific teams have requested
the use of a version with no limits.
BioNMF-Grid
Alberto Pascual (UCM): “The Grid technology
has much to offer to applications such as BioNMF.”
META-Dock:
The purpose of this application is to provide a Grid-based filtering method
for pharmaceutical studies. When a protein that causes a disease
is known, the next step is to identify a molecule capable of preventing the action of this protein. As an alternative method for this
purpose, in silico simulation with a molecular
docking program has been used. AutoDock, a program well
recognized in the academic world, was initially proposed, but the idea is to use more than one
molecular docking program. The organization responsible for the initial use of the application was the National Autonomous University of
Mexico (UNAM).
Jérôme Verleyen, a researcher linked to this tool, explains that its gridification took place during the EELA-2 project. “We use many tools
developed by the project for inclusion in the program. The initial step
for the gridification of Meta-Dock was to allow access to important
resources for a broader study of protein docking. The gridification
was carried out without much trouble during one of
the Grid Schools. There I was able to integrate most
of the tools available in the Grid, with high-level tutors. That allowed me to have an overview of those
tools, and it continues to serve me to this day.”
Jérôme Verleyen (UNAM) looks for interested
groups for the development of META-Dock
Verleyen is enthusiastic about the potential use of this application on the Grid platform, so he recommends it to the bioinformatics community interested in this area. “I would like to find interested groups to
keep developing this application.”
- Do you think this application could be adapted to other areas?
- I think so; the usage pattern of this tool applies to other tools as well. Right now I can’t think of which one exactly, but screening a large set of data to choose the best candidate is common in science.
GrEMBOSS
EMBOSS (“European Molecular Biology Open Software Suite”) from EMBnet
(European Molecular Biology Network) is an open source software analysis package developed for bioinformatics. Its gridification resulted from a joint project of three institutions of the National Autonomous
University of Mexico (UNAM): the General Direction of Academic Computing
Services (DGSCA), the Center for Genomic Sciences (CCG) and the Institute of
Biotechnology (IBT). EMBOSS includes some 150 packages and tools for
analyzing genome sequences in a fully integrated set. With EMBOSS,
researchers can perform sequence alignment, quick database searches with
sequence patterns, protein pattern identification, analysis of nucleotide sequence patterns and rapid large-scale identification of sequence patterns,
and can use presentation tools for publication.
“EMBOSS is a widely used package of tools for genomic
sequence analysis, developed
to support the work of the bioinformatics community”, said Romualdo Zayas-Lagunas, a
researcher at UNAM. “The first time EMBOSS
was migrated onto a Grid infrastructure was at the First EELA Grid School in
Itacuruçá (Brazil) in December 2005.
Subsequently, the project was presented at
the 3rd EELA Conference in Catania in 2007.
Finally, GrEMBOSS was the project of
the National Autonomous University of
Mexico that was included in EELA-2”.
Romualdo Zayas-Lagunas:
GrEMBOSS for sequence analysis
Zayas-Lagunas explains the advantages
of this tool on the Grid platform:
GrEMBOSS was built so that users do not have to worry about
knowing how to build a JDL file. It includes a script that builds the file and
packages the files needed for execution. Data retrieval is automatic, and the user does not need to know where the analysis
was performed.
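To give an idea of what such a helper spares the user, the sketch below generates a small gLite-style JDL job description from a few parameters. It is only an illustration and not the GrEMBOSS script: the function name `build_jdl`, the wrapper script `run_analysis.sh` and the file names are hypothetical, and only commonly used JDL attributes (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are emitted.

```python
def build_jdl(executable, arguments, input_files, output_files):
    """Return the text of a simple JDL job description.

    The standard output and error files are always added to the
    OutputSandbox so they are retrieved together with the results.
    """
    def as_list(items):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join('"' + item + '"' for item in items) + "}"

    lines = [
        'Executable = "' + executable + '";',
        'Arguments = "' + arguments + '";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = " + as_list(input_files) + ";",
        "OutputSandbox = " + as_list(["std.out", "std.err"] + list(output_files)) + ";",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical wrapper script and data files, for illustration only.
    print(build_jdl(
        executable="run_analysis.sh",
        arguments="query.fasta",
        input_files=["run_analysis.sh", "query.fasta"],
        output_files=["result.txt"],
    ))
```

A real helper would also package the input files and retrieve the output once the job finishes, which is the part the article says GrEMBOSS automates.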
“The groups that work with bioinformatics may be interested in
GrEMBOSS for sequence analysis. From our point of view, and taking into account the sound design of EMBOSS, we can think of ‘gridifying’ other bioinformatics applications with no requirement other than following the EMBOSS software development standard.”
jModelTest
The application MODELTEST and its Java version jModelTest were
developed by Professor David Posada of the University of Vigo
(http://darwin.uvigo.es) and have more than 30,000 registered users around the world. They are basic to phylogenetic studies because
they provide the nucleotide substitution model that best fits an organism. The gridification was carried out recently by the Centro de
Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT) in
Madrid. Any scientist in the world with an interest in the
phylogenetic field, or in investigations that require models of nucleotide substitution, is a potential user of this application.
Manuel Rodríguez, CIEMAT researcher and the contact responsible for
this tool, said that the migration of jModelTest to the Grid is considered a major breakthrough,
since it makes the tool accessible to researchers through distributed rather than sequential computing.
“Having a new distributed version that
replaces the sequential one, you can
get results faster.” The previous work of parallelizing the tasks
was performed at the Supercomputing Center
of Galicia (CESGA).
The application is in production and available for use. “Right
now research work will run on the GISELA infrastructure. A
CIEMAT development team is working on a new version with
more efficient execution planning and is implementing a web
portal for transparent job submission.”
JModelTest
Manuel Rodríguez: “With a distributed
version it is possible to obtain results faster”
Challenges
Advances in high performance
computing have been far
from negligible: today these
tools can activate mechanisms
that process millions of sequences at once, without having to stop the world to wait for
results. However, the challenge
grows faster than the possibilities: the amount of data
generated by sequencing organisms has been of such magnitude in recent years that it cannot be said with confidence whether
all computational problems can ever be solved
with current computing techniques.
For Rafael Mayo, CIEMAT researcher and member of the
GISELA project, “today there
are problems that the grid cannot solve satisfactorily given the
large number of results generated. The size is so large that it
takes much more time to bring
those results back to the researcher’s computer than
the actual calculation time”.
New solutions and breakthroughs are creating new problems in this large-scale production of scientific data. For example, High Energy Physics faces
the biggest challenge: the staff
of the European Organization
for Nuclear Research (CERN, for its acronym in French) devotes
enormous effort to improving the transfer of the large volumes of data
produced by all the measurements and calculations taking place
within the Large Hadron Collider (LHC).
In the case of the Latin American Grid, the effort to address the difficulties of handling large volumes of data must also be constant.
After the testing phase of gridification tools, GISELA is aiming at the
formation of stable user communities. Through the experience
of using gridified applications, the proposals have become the basis
for very specific academic projects. It is now necessary to begin to
reap the fruits of this learning.
Towards a reliable infrastructure:
The main limitations raised by bioinformatics users are the need for a reliable,
well-documented infrastructure and for a good job
scheduling system. Providing
an adequate service has been the biggest
challenge.
In this regard, Francisco Brasileiro, head of
the Infrastructure and Application-oriented Services
for User Communities work package of
the GISELA project, sums up the progress made
on the major difficulties of advanced computing services on the Grid platform in
Latin America, one of which is the adaptation
of applications to the runtime environments.
“The DIRAC middleware aims at mitigating this
problem. In addition, our
UNIANDES colleagues,
led by Harold Castro,
have followed a different approach to simplify
the deployment of applications on the Grid.
They are responsible
for the development of
a new service called
Customized Virtual Cluster, or CVC.
The CVC uses virtualization tools
such as VMware,
which lets you take advantage of the capabilities
of tens or hundreds of machines in the computer
labs. CVC is in beta testing at UNIANDES, and the
first to want to experiment with CVC are welcome.”
Francisco Brasileiro: “We work
to optimize the capabilities of
the hundreds of machines in
the computer labs”
With respect to efficiently handling large batches of
jobs, Brasileiro says that DIRAC and OurGrid have
facilities for this purpose. “The OurGrid scheduler
uses a simple replication policy that replicates these
slow jobs, mitigating the unfortunate assignments
that led these jobs to run slower than the others.”
All information on the portfolio of services is available on the GISELA Project website (activities of
Work Package 6, WP6).
• Molecular predictions with a single click
The GISELA working groups supporting communities, infrastructure and applications, together with
the Universidad del Valle in Colombia, have established
a fruitful collaboration to run on the Grid platform
tools for managing chemical information with high
computational requirements.
Gaussian has been successfully integrated into
the Grid. As a result, researchers at the Universidad
del Valle are able to predict Nuclear
Magnetic Resonance (NMR) parameters with a single click,
and have powerful and friendly tools for the rational use of available resources in Europe and
Latin America, thanks to an interface called
myLims.org.
The myLims.org system (Laboratory Information
Management System) allows storing, manipulating and sharing virtually any type of experimental data, and predicting molecular properties.
Users can add features and integrate
their own programs to predict more properties
from structure, spectra, or any information already available in the system.
The processes created by the LIMS rely on DIRAC
for sending jobs to the Grid. A series of steps
controls the runtime environment and installs the software
needed to perform the generated tasks; after
several tests, the results (output files) of the retrieved tasks are sent back to the LIMS
platform to be presented to end users in an
easy and friendly way.
This Grid initiative has been supported
by the European
Commission (through
project GISELA # 261
487), RENATA (RC
Colciencias 561-2009),
CNPq, CEFET / RJ –
through a DIPPG grant,
and was initiated by the
collaboration with the
W-eNMR project.
Grid as an alternative:
Another major challenge for the use of the Grid
platform, pointed out by bioinformatics users,
is the growing trend of Cloud technology. Two basic concepts are opposed in these two proposals:
Grid is a collaborative scheme, with free access
to resources, while Cloud computing is an on-demand distributed computing proposal. Asked whether
GISELA can be considered an alternative in
the region to paid cloud computing, Philippe
Gavillet, deputy coordinator of the project,
gives a categorical
statement: “The answer is yes. GISELA is
a powerful e-Infrastructure. We have more
than six years of experience in the field of
e-Science in Latin
America, with projects funded by Europe,
and have been
present in almost all countries of this region. We
are a wide coverage e-Infrastructure for scientific
collaboration. Currently, the topic of Cloud is on
the table and there are many variations around:
public or private, free or paid. In this sense, we will necessarily have to face a review.
We are working on a solution that fits the context
of Latin America.”
Philippe Gavillet, GISELA Deputy
Project Coordinator: “We are a
powerful e-Infrastructure.”
GISELA provides scientists with adequate access to
distributed computing resources, supports the
migration of scientific applications to the Grid infrastructure, from development to implementation, and supports the use of the e-Infrastructure and
application-related services. It currently provides
a storage capacity of approximately 60 TB through
18 resource centers to support scientific
research groups, and aims to continue growing.
For more information, interested bioinformatics communities in Latin America may contact
[email protected]
• Some published reports about Grid and Bio
bioNMF: a web-based tool for nonnegative matrix factorization in biology. Nucleic Acids Research v.36,
Oxford University Press 2008.

Estimating Conductivity Distribution of
Transmural Wedges of the Ventricle
Using Parallel Genetic Algorithms.
MARTINS, Daves Marcio Silva; XAVIER, Carol
Ribeiro; SANTOS, Elisa Portes dos; VIEIRA,
Vinícius da Fonseca; OLIVEIRA, Rafael
Sachetto; SANTOS, Rodrigo Weber dos. In:
Computers in Cardiology 2006, Valencia (Spain),
Computers in Cardiology v.33, p.49-52, 2006.

IntegraEPI: a Grid-based Epidemic
Surveillance System. In: HealthGrid,
Geneva (Switzerland), Proceedings of the
HealthGrid Conference 2007.

InterProScan: protein domains identifier. E. Quevillon, V. Silventoinen, S. Pillai,
N. Harte, N. Mulder, R. Apweiler and R. Lopez.
Nucleic Acids Research v.33, Oxford Journals
2005.

Managing structural genomic workflows using Web services. Maria Claudia
Cavalcanti et al., Data & Knowledge Engineering
v.53 Issue 1, p.45-74, Elsevier 2005.

Performance Tests of GAMOS Software
on EELA-2 Infrastructure. In: 2nd EELA-2
Conference, Choroní (Venezuela), Proceedings of
The Second EELA-2 Conference ISBN 978-847834-627-1, p.379-385, Editorial CIEMAT 2009.
Reaching the first year
During the second semester of the GISELA project, activities focused on defining joint action
agreements between the various working groups, addressing technology updates, promoting
the Grid platform and, primarily, reviewing possible ways to meet the big challenge of transferring to
RedCLARA a distributed computing model available to research communities in the region.
Agreements
In April, GISELA was among the projects present
at the EGI User Forum 2011 (Vilnius, Lithuania)
and, seizing the opportunity, formalized agreements with EGI.eu, EGI-InSPIRE and e-ScienceTalk, a European outreach project on e-Infrastructure trends.
The agreement with EGI.eu and the EGI-InSPIRE
project will expand the collaboration between
GISELA and the flagship Grid project in Europe.
The agreement with e-ScienceTalk aims at
joint work with GISELA on the dissemination of
Grid applications in Europe and Latin America,
using mechanisms such as GridCafé, GridCast,
iSGTW and GridGuide, and other initiatives that
may arise to foster public understanding of
the use of the Grid infrastructure.
• New User
In June, the Computational
Modeling Center (CMC) of La Universidad
del Zulia (LUZ) expressed interest in joining
as a Grid user and sharing resources in Latin
America. The first talks have begun in
Venezuela to define the technical and protocol details for the integration of this facility into
the GISELA project.
The CMC is an institution based in Maracaibo
(Venezuela) made up of researchers from
very diverse and interdisciplinary areas who
work on the modeling, simulation and scientific
visualization of physical, meteorological, chemical, biological, medical, economic and other phenomena.
As announced by Gilberto Diaz, manager of
the networking services working group of
GISELA, this center has already been working with advanced computer applications in
the area of climate forecasts. “The intention is
to bring this community to GISELA and to the
Venezuelan Grid.”
The next step aims at signing a memorandum
of understanding to integrate the applications
of this center to the GISELA project and create a Grid resource center in LUZ.
Itacuruçá: meeting point
In late March, GISELA members reviewed the progress of the major regional Grid infrastructure initiative in Latin America and its
adoption by the research communities. Itacuruçá
(Brazil) was the meeting point where, over three days, the project
working groups reviewed aspects of services, networks, community
support, dissemination and training in Grid.
Much of the meeting was devoted to reviewing the work
plan proposed to transfer the Grid model to Latin America with a
vision of sustainability. This task, in which RedCLARA plays a central
role, is distributed among the members of a liaison group with technical and organizational responsibilities, who are to build a model of distributed computing services in the region, which is the main challenge
of the project.
All participants stressed the significant progress made at this meeting in better
defining mechanisms for joint work between the members of GISELA,
whose groups come from different countries of Latin America and
Europe. Three days of intense meetings and face-to-face work showed
the need to discuss the different perceptions of how to carry out
the objectives of the project and to find common solutions.
• New version of OurGrid
In August, GISELA announced the launch of the
new version of the OurGrid middleware, a regional development that, in the context of EELA,
EELA2 and GISELA, has been used by hundreds
of users to accelerate the execution of parallel
applications whose tasks do not communicate
with one another.
The OurGrid middleware was developed in 2004 and
has since been updated in several versions that
have improved its performance, generating an active community of users and developers around
this stable, free and open source software. Under
the logic of OurGrid, storage and computing
resources are provided by
the participants of a network,
who share them and obtain maximum performance in proportion to the contribution
they provide.
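The idea of receiving resources in proportion to what one contributes can be made concrete with a small sketch. This is only an illustration of proportional sharing under assumed inputs, not the algorithm OurGrid actually uses: the function name, the site names and the largest-remainder rounding rule are all choices made for the example.

```python
def share_by_contribution(contributions, available):
    """Split `available` whole units (e.g. cores) among participants
    in proportion to what each one has contributed to the network.

    Rounding uses the largest-remainder rule so that the shares
    always add up to exactly `available`.
    """
    total = sum(contributions.values())
    if total == 0:
        return {site: 0 for site in contributions}
    exact = {site: available * amount / total
             for site, amount in contributions.items()}
    shares = {site: int(value) for site, value in exact.items()}
    leftover = available - sum(shares.values())
    # Hand the remaining units to the largest fractional parts first.
    by_remainder = sorted(contributions,
                          key=lambda site: exact[site] - shares[site],
                          reverse=True)
    for site in by_remainder[:leftover]:
        shares[site] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical sites and contributions, for illustration only.
    print(share_by_contribution({"site_a": 50, "site_b": 30, "site_c": 20}, 10))
```

With contributions of 50, 30 and 20 units and 10 cores to hand out, the three sites receive 5, 3 and 2 cores respectively.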
OurGrid has had five
versions since its first development. The latest, optimized version (4.2.4) is
already available.
• Dissemination
Since last May, GISELA has distributed a monthly informative summary, with brief notes on project progress and trends concerning the Grid infrastructure
in Latin America.
The summary is received by a network of countries in the region, with the support of RedCLARA
Communication Management and institutions
such as CUDI (Mexico) and Renata (Colombia),
whose contributions have facilitated the distribution
of Grid news from GISELA in Latin America.
It has also counted on the European initiative for
Grid dissemination, International Science Grid This
Week (iSGTW), which regularly publishes notes from
GISELA’s briefing summary,
linked to their original source.
Interview
Distributed computing
as a service in Latin America
The new model of Advanced Computing (AC) services in Latin America seeks to build a bridge between the
established e-Infrastructure and emerging initiatives. The challenge is to build regional
capacity under equal conditions for all countries. To this end, collaboration networks, training and
the sustainability of virtual research communities are the ideal targets. Industry and business are also included
within the scope of the service.
Salma Jalife and Luis Núñez
are responsible for proposing
the Business Plan for the distributed computing service in Latin
America, following the experience
of setting up the Grid infrastructure
in the region. The main purpose is to define a viable strategy
to ensure a smooth transition
from the current GISELA model
(e-Infrastructure and support
to research communities) to a
RedCLARA model. The ideal post-GISELA scenario is to
integrate the grid services into
the general Advanced Computing services shared among the national networks in Latin America,
with the support of RedCLARA.
GISELA interviewed them especially for this edition, one
year into the project.
What has been the main challenge in
the development of the proposed Business Plan for the distributed computing
service in Latin America?
SJ: The member countries of RedCLARA have different
degrees of technical maturity, so making any new
service sustainable is a challenge. In this case,
11 of the 15 RedCLARA member countries participate in GISELA, and the challenge is greater because these countries differ widely in the level of development of
their Grid infrastructure and in the capabilities that the
representatives of the Resource Centers have managed to develop. Furthermore, a critical mass of researchers, businesses
and general users needing to migrate their applications to a
grid infrastructure has not yet been achieved. Now the challenge will be to include grid services in a package of Advanced Computing services using the CLARA model.
What has RedCLARA done about this in the
first year of GISELA?
LN: During this first year of GISELA, RedCLARA representatives
have focused on raising awareness among regional and national
decision makers of the importance of supporting the creation and
operation of e-Infrastructure and, in general, on promoting the development of e-Science in Latin America. Some initiatives have been
carried out successfully. For example, RedCLARA has supported
virtual research communities, from which we derive the potential
users of Advanced Computing services in the region, one of which
is grid computing. Through its national networks, CLARA has supported the creation of a strong group of national institutions that have set up national grid structures such as
the National Grid Initiative (NGI) or the Equivalent Domestic Grid
Structure (EDGS). For example, Colombia and Ecuador have created
their NGIs, Mexico has a regional Grid Operation Center (GOC), and
Argentina has expressed its intention to create an e-Science group to be responsible, among other things, for the development of its
national Grid infrastructure.
The initial proposal was to
adapt the model developed in
Europe in the Grid service to Latin America. Has it been viable?
?
LN:
Luis Núñez: Grid technology is a step in the
evolution of high performance computing
gi ela
According to this model, each country
should organize a National Grid Initiative, primarily in charge of promoting the services of Advanced Computing; the academic and research institutions should be part of the NGI in each country
and be responsible for establishing the rules to
integrate the e-Infrastructure, coordinate and promote the use of services between virtual research
communities. But in order to successfully carry
out this model, it is required for each country to
effectively have a national initiative well-established, for national networks to be sustainable and
already with an institutional and organizational
maturity in the development of collaborative projects, integrating their communities in an e-Science proposal, supported by public policies. Unfortunately, this scenario which has indeed been
achieved in Europe after years of collaboration
between governments, national networks and research communities, has a long way to go in La-
14
tin America. With few exceptions, countries have built their national
networks with different organizational schemes, and it still remains a
major challenge to convince governments that advanced networks
are part of the State and may play a role in the development
of their countries. Given the reality of Latin America,
the current state of national networks and their organization does
not allow the European model to be implemented at this moment,
so CLARA designed a more flexible model, disaggregating some
services with the aim of creating opportunities for sustainability. We are
proposing to create specific computing services that the user can
configure for a limited time: Cloud computing
services that can be rented not only by research groups but also by
national networks. Grid computing should be a lever, an excuse to
promote Advanced Computing services in the region.
Salma Jalife: the challenge is greater because there are different levels in the development of grid infrastructure
Then, what is the path that has been drawn in the first year of discussion on the Grid services in the region?
SJ: RedCLARA has evaluated different ways to introduce Grid services in the region under an architecture spanning several countries. After a series of group discussions with the CLARA
transition team (TT) and interaction with the CLARA Marketing
Manager, it was decided to build a Business Plan that is viable not
only for a single service, but also for a set of services that have several
things in common. The reasoning behind this decision is that the research communities have not yet achieved a critical mass to consume grid services, except for the community of high-energy physics.
Other communities are learning where and how to make effective
use of e-Infrastructure and run some applications. Moreover, with
the commercial launch of Cloud services, some communities are
exploring different combinations of high performance computing
and storage techniques to meet their needs. So far, the Business
Plan defines a group of Advanced Computing Services, which includes the Grid service and the use of the e-Infrastructure.
Balance so far?
LN: It’s not an easy task to handle such differences in Latin America. Only those countries that have the capacity to interconnect resources and make available hundreds
of CPUs will be able to form a national grid initiative. Countries that
are not able to achieve this type of organization should be treated differently, and a common solution needs to be found for them. Given this,
RedCLARA proposes a similar structure to the one that currently
characterizes its activities with other services. In this case, a Regional Network Operations Center would cover those countries with
no national grid. Gradually, as the national network infrastructure grows, they would become able to install their own resource centers. Once that is done, the functions would be transferred to
national networks. Now the plot thickens with the ephemeral nature
of these advanced technologies. Grid technology is a step in the
evolution of high performance computing. We can’t think of organizing academic services around a technology that could change.
We have to consider organizing computing services for research
groups that are not focused on a particular technology.
Do you mean that, beyond a particular discussion of transferring grid services, it is about continuing to define mechanisms to mature the regional network into advanced networks for research?
LN: RedCLARA has to find the most efficient way to operate
in the region, with a minimum budget and maximum benefit to its
members, without sacrificing the quality of services. Collaboration
and technology transfer are the main assets of its members. In recent years RedCLARA embarked on an ambitious project to support the construction of national networks by strengthening specific
research communities. Brazil and Mexico’s networks were the main
partners in this project. National agencies responsible for Science
and Technology public policies, Higher Education and Telecommunications play a specific role in the development of advanced networks. Some countries have responded very positively. However,
there is still a big gap that policy makers will have to fill regarding the idea
of integrating advanced networks as a tool for innovation and technological development in each country.
Evaluating the operational model
Grid infrastructure and Advanced
Computing (AC) services could
follow the same process as other
RedCLARA services. So far, we
are evaluating the following
proposed arrangements:
Countries that can support the
creation of a GOC (Grid Operations Centre) will have one, and
those that cannot afford one or are
not sufficiently trained to create
one can use the Regional Operations Center (ROC). CLARA would
lease the operation of the regional
NOC (Network Operation Centre)
every four years, and the NREN
(National Research & Education
Network) that provides the most
staff to the operation, among other
requirements, would operate the
RNOC during that period. A new
AC technical group must be
created to follow the evolution of services.
The national GOC would be responsible for its national services
and would interact with the ROC
when the services have a regional
scope. The ROC will take care of
the domestic functions of those countries without a GOC. RedCLARA
would use a minimum of staff to
provide the ROC services, as
most services will remain under
the RCs (resource centers) distributed in the different participating
countries. The institutions or the
National Science and Technology foundations, as appropriate,
should cover operating costs. When
GISELA ends, the CLARA TT
would take over and RedCLARA would have to auction the
ROC operation among the participating institutions.
Two countries
steadily in the e-Science
During the second semester of
the GISELA project, Latin America received two pieces of good news involving advances in e-Science,
in Colombia and Argentina, two
countries that are strengthening their
efforts to consolidate their organizational and technical infrastructure for the academic use of
advanced networks.
A priority for Argentina
A number of universities and research centers in Argentina
signed an agreement to establish a national e-Science consortium, to promote Grid in research and education in both
public and private institutions, under a model inspired by the
GISELA project.
The collaborative network, coordinated by INNOVA-T, aims
to promote the development of the Grid infrastructure in
Argentina, as well as to increase awareness in the scientific
community, industry, government and other segments of
society about the potential uses and benefits of the technologies associated with distributed computing.
This initiative reinforces the objectives of the National High
Performance Computing System (SNCAD for its acronym in
Spanish), promoted by the Ministry of Science, Technology
and Productive Innovation of Argentina and the Interagency
Council of Science and Technology, to consolidate in this
South American country the collaborative work needed to
meet the growing demand of
the scientific and technological community in the areas
of storage, grid computing,
high performance computing,
visualization and other emerging technologies.
Colombian Authority
Colombia joined the list of Latin American countries that have
certification authorities associated with Grid services. The
Colombian Certification Authority will be the Andes University in Bogotá (Uniandes for its acronym in Spanish), an institution
already accredited by the organization that oversees the policies
and standards of all Grid certification authorities, the IGTF
(International Grid Trust Federation).
The certification authority can reliably endorse the subscription of users and servers in the grid, within a stringent security
policy that ensures common guidelines on the use of this platform. From now on, Colombians interested in using the Grid
platform will be able to manage their certification in their own
country, through Uniandes.
To be recognized as a certification authority, national authorities must go through a lengthy formal process that thoroughly
reviews their technical requirements, infrastructure, personnel
and ethics. In Latin America, the countries with national Grid
certification authorities are: Argentina, Brazil, Chile, Mexico,
Venezuela, and now Colombia.
tailored to the needs of Latin America
A large number of computers and a large amount of storage,
provided by the project partners, are now
available to groups of scientists working
on problems that demand large quantities
of computing resources and that would be
difficult to solve without this e-infrastructure.
http://www.gisela-grid.eu/