Publication

Data analytics in the cloud with flexible MapReduce workflows

dc.contributor.author: Gonçalves, Carlos Jorge de Sousa
dc.contributor.author: Assunção, Luis
dc.contributor.author: Cunha, José C.
dc.date.accessioned: 2015-09-07T13:15:28Z
dc.date.available: 2015-09-07T13:15:28Z
dc.date.issued: 2013-02-04
dc.description.abstract: Data analytic applications are characterized by large data sets that are subject to a series of processing phases. Some of these phases are executed sequentially, but others can be executed concurrently or in parallel on clusters, grids or clouds. The MapReduce programming model has been applied to process large data sets in cluster and cloud environments. Developing an application with MapReduce requires installing, configuring and accessing specific frameworks such as Apache Hadoop or Elastic MapReduce in the Amazon cloud. It would be desirable to provide more flexibility in adjusting such configurations according to the application characteristics. Furthermore, the composition of the multiple phases of a data analytic application requires the specification of all the phases and their orchestration. The original MapReduce model and environment lack flexible support for such configuration and composition. Recognizing that scientific workflows have been successfully applied to modeling complex applications, this paper describes our experiments on implementing MapReduce as subworkflows in the AWARD framework (Autonomic Workflow Activities Reconfigurable and Dynamic). A text mining data analytic application is modeled as a complex workflow with multiple phases, where individual workflow nodes support MapReduce computations. As in typical MapReduce environments, the end user only needs to define the application algorithms for input data processing and for the map and reduce functions. We present experimental results obtained when using the AWARD framework to execute MapReduce workflows deployed over multiple Amazon EC2 (Elastic Compute Cloud) instances. [por]
dc.identifier.citation: GONÇALVES, Carlos; ASSUNÇÃO, Luís; CUNHA, José C. – Data Analytics in the cloud with flexible MapReduce workflows. In 2012 IEEE 4th International Conference on Cloud Computing Technology and Science (CloudCom). Taipei, Taiwan: IEEE, 2012. ISBN 978-1-4673-4510-1. Pp. 427-434. [por]
dc.identifier.doi: 10.1109/CloudCom.2012.6427527
dc.identifier.isbn: 978-1-4673-4510-1
dc.identifier.isbn: 978-1-4673-4511-8
dc.identifier.isbn: 978-1-4673-4509-5
dc.identifier.uri: http://hdl.handle.net/10400.21/5078
dc.language.iso: eng [por]
dc.peerreviewed: yes [por]
dc.publisher: IEEE [por]
dc.relation: CITI/FCT/UNL-2011-2012
dc.relation: Strategic Project - UI 527 - 2011-2012
dc.subject: MapReduce [por]
dc.subject: Workflow [por]
dc.subject: Text mining [por]
dc.subject: Cloud [por]
dc.title: Data analytics in the cloud with flexible MapReduce workflows [por]
dc.type: conference object
dspace.entity.type: Publication
oaire.awardTitle: Strategic Project - UI 527 - 2011-2012
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/PEst-OE%2FEEI%2FUI0527%2F2011/PT
oaire.citation.conferencePlace: New York
oaire.citation.endPage: 434 [por]
oaire.citation.startPage: 427 [por]
oaire.citation.title: 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings [por]
oaire.fundingStream: 6817 - DCRRNI ID
person.familyName: Assunção
person.givenName: Luis
person.identifier.orcid: 0000-0003-4858-2751
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.rights: closedAccess [por]
rcaap.type: conferenceObject [por]
relation.isAuthorOfPublication: 62f832ff-4f6a-41ae-b30b-2a7952fb204b
relation.isAuthorOfPublication.latestForDiscovery: 62f832ff-4f6a-41ae-b30b-2a7952fb204b
relation.isProjectOfPublication: 216ba938-478e-4068-9333-cfed8eccc994
relation.isProjectOfPublication.latestForDiscovery: 216ba938-478e-4068-9333-cfed8eccc994

Files

Original bundle
Name: DATA ANALYTICS IN THE CLOUD WITH FLEXIBLE MAPREDUCE WORKFLOWS.pdf
Size: 3.03 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission