funsung: meetup


Thursday, October 02, 2014

DockerCon 2014

http://www.dockercon.com/agenda.html
http://googlecloudplatform.blogspot.com/2014/06/an-update-on-container-support-on-google-cloud-platform.html
https://github.com/GoogleCloudPlatform/kubernetes

Wednesday, September 03, 2014

Spark MLlib

http://spark.apache.org/docs/latest/mllib-guide.html

Binary Classification

SVMWithSGD
LogisticRegressionWithSGD

Linear Regression

LinearRegressionWithSGD
RidgeRegressionWithSGD
LassoWithSGD

Clustering

KMeans

Collaborative Filtering

ALS

Gradient Descent Primitive

GradientDescent

[Note] Example programs are under spark/examples/src/main/scala/org/apache/spark/examples/mllib
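
[Note] The "WithSGD" models listed above are all trained with stochastic gradient descent. Below is a minimal pure-Python sketch of that idea for least-squares linear regression. It is not MLlib's API: the function name, step size, iteration count, and toy data are all assumptions made for illustration.

```python
import random


def sgd_linear_regression(points, step_size=0.1, iterations=5000, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    Illustrative only: names and defaults are assumptions, not MLlib's API.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(iterations):
        x, y = rng.choice(points)      # sample one training example
        error = (w * x + b) - y        # prediction error on that example
        w -= step_size * error * x     # gradient of 0.5 * error**2 w.r.t. w
        b -= step_size * error         # gradient of 0.5 * error**2 w.r.t. b
    return w, b


# Noise-free toy data drawn from y = 2x + 1 (hypothetical).
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(11)]
w, b = sgd_linear_regression(data)
print(w, b)
```

On this toy data the fitted w and b should end up close to 2 and 1. MLlib runs the same kind of update over distributed mini-batches and offers regularized variants (ridge, lasso).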

Thursday, August 28, 2014

HA cluster

http://en.wikipedia.org/wiki/High-availability_cluster

The most common size for an HA cluster is a two-node cluster, since that is the minimum required to provide redundancy, but many clusters consist of many more, sometimes dozens of nodes. Such configurations can sometimes be categorized into one of the following models:

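Active/active: Traffic intended for the failed node is either passed on to an existing node or load balanced across the remaining nodes. This is usually only possible when the nodes use a homogeneous software configuration.

Active/passive: Provides a fully redundant instance of each node, which is only brought online when its associated primary node fails. This configuration typically requires the most extra hardware.

N+1: Provides a single extra node that is brought online to take over the role of whichever node has failed.

N+M: Provides M standby nodes instead of one, for clusters where a single dedicated failover node would not offer enough redundancy.

N-to-1: Lets the standby node become the active one temporarily, until the original node is restored and services are failed back to it.

N-to-N: Redistributes the services of the failed node among the remaining active nodes, so no dedicated standby is needed but every node must carry spare capacity.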
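
[Note] A rough Python sketch of the difference between the active/passive and active/active models. The node names, the client ids, and the round-robin redistribution are assumptions for illustration; real cluster managers also handle health checks, fencing, and state transfer.

```python
def active_passive_failover(primaries, standbys, failed):
    """Active/passive: each primary has its own standby, used only on failure."""
    return [standbys[node] if node in failed else node for node in primaries]


def active_active_failover(assignments, failed):
    """Active/active: spread a failed node's traffic over the surviving nodes.

    assignments maps node name -> list of client ids it currently serves.
    """
    survivors = sorted(n for n in assignments if n not in failed)
    if not survivors:
        raise RuntimeError("no surviving nodes to take over traffic")
    result = {n: list(assignments[n]) for n in survivors}
    orphans = [c for n in sorted(failed) if n in assignments for c in assignments[n]]
    for i, client in enumerate(orphans):
        # Hand orphaned clients to survivors in round-robin order.
        result[survivors[i % len(survivors)]].append(client)
    return result


serving = active_passive_failover(
    ["db1", "db2"],
    {"db1": "db1-standby", "db2": "db2-standby"},
    failed={"db1"},
)
balanced = active_active_failover(
    {"web1": ["a", "b"], "web2": ["c"], "web3": ["d"]},
    failed={"web1"},
)
print(serving)
print(balanced)
```

In the active/passive case the standby sits idle until db1 fails. In the active/active case the clients of web1 are absorbed by web2 and web3, which is why those nodes need a homogeneous configuration and spare capacity.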

Wednesday, August 27, 2014

Apache Sentry: Enterprise-grade Security for Hadoop

http://www.slideshare.net/cloudera/hug-apr-2014-apache-sentryfinal
http://gethue.com/category/presentation/

Apache Sentry: Enterprise-grade Security for Hadoop from Cloudera, Inc.

HBaseCon 2013: How (and Why) Phoenix Puts the SQL Back into NoSQL from Cloudera, Inc.

Integrate Hue with your Hadoop cluster - Yahoo! Hadoop Meetup from gethue

Monday, August 18, 2014

Build a Realtime Chat App with GoAngular

http://goangular.org/
https://developers.goinstant.com/v1/GoAngular/getting_started.html

Tuesday, August 12, 2014

Recommender Systems

http://en.wikipedia.org/wiki/Recommender_system
https://www.coursera.org/course/recsys
http://www.ibm.com/developerworks/library/os-recommender1/
http://thirdeyecss.com/

Thursday, July 24, 2014

Infinite Cloud Session Clustering with Apache Shiro

http://www.slideshare.net/planetcassandra/infinite-sessionclusteringwithapacheshiro9x16-23252714
http://www.slideshare.net/planetcassandra

C* Summit 2013: Remember Me! Session Clustering with Cassandra by Les Hazlewood from Planet Cassandra — A DataStax Community Service

Monday, July 21, 2014

http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010
http://www.youtube.com/watch?v=fEsmydn747c
http://solr-vs-elasticsearch.com/
http://www.google.com/insidesearch/features/search/knowledge.html
http://news.walmart.com/news-archive/2012/08/30/walmart-announces-new-search-engine-to-power-walmartcom
http://techcrunch.com/2012/08/30/in-battle-with-amazon-walmart-unveils-polaris-a-semantic-search-engine-for-products/

ElasticSearch at berlinbuzzwords 2010 from elasticsearch

Thursday, June 26, 2014

LinkedIn channel

http://www.youtube.com/user/LinkedIn
http://www.ustream.tv/linkedin-events

Friday, June 20, 2014

PaaS architecture: VM (sandbox) vs. container

http://blog.dotcloud.com/under-the-hood-linux-kernels-on-dotcloud-part
http://blog.dotcloud.com/kernel-secrets-from-the-paas-garage-part-24-c
http://blog.dotcloud.com/kernel-secrets-from-the-paas-garage-part-34-a
http://blog.dotcloud.com/kernel-secrets-from-the-paas-garage-part-44-g
http://blog.dotcloud.com/under-the-hood-dotcloud-http-routing-layer

https://www.openshift.com/blogs/paas-evolves-with-linux-containers-and-openstack
http://www.infoq.com/articles/PaaS_Is_The_Word
http://java.dzone.com/articles/paas-service-orchestration-vs

http://www.linuxjournal.com/content/containers%E2%80%94not-virtual-machines%E2%80%94are-future-cloud
http://www.networkworld.com/community/blog/baidu-chooses-docker%E2%80%99s-containers-over-sandbox-paas
http://java.dzone.com/articles/essential-characteristics-paas

Application Infrastructure Platform

API-Based PaaS

Container-based PaaS

Wednesday, June 18, 2014

Docker: the Linux container engine

https://www.docker.io/
https://www.docker.io/gettingstarted/
http://blog.blackwhite.tw/2013/12/docker.html
http://www.ithome.com.tw/news/91848

Docker introduction from Docker

Docker in practice - chenyifei from Docker

Wednesday, May 28, 2014

Rails + MongoDB

http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-ruby-on-rails-3/
http://www.rosipov.com/blog/rails-and-mongodb-with-cygwin/
http://garypickrell.wordpress.com/2012/02/16/installing-rvm-and-rails-on-cygwin/

http://gorails.com/blog/rails-4-0-with-mongodb-and-mongoid
http://support.mongohq.com/tutorials/rails-4-mongoid-heroku.html
http://blog.mongodb.org/post/53271876885/ruby-rails-mongodb-and-the-object-relational-mismatch

On cygwin:

1. rails new my_app --skip-active-record
2. cd my_app
3. vi Gemfile (add the line: gem 'mongoid', git: 'https://github.com/mongoid/mongoid.git')
4. bundle install
5. rails g mongoid:config
6. rails g scaffold Project name:String status:String
7. rails server

Go to the "projects" web interface and add a record; you can then see it in MongoDB.

[Note] my_app_development is specified by config generated in step 5 (config/mongoid.yml)
[Note] MongoDB's default data directory is /data/db, which must exist before mongod starts; on Cygwin: mkdir -p /cygdrive/c/data/db

Tuesday, May 27, 2014

node.js

http://howtonode.org/how-to-install-nodejs
http://www.toptal.com/nodejs/why-the-hell-would-i-use-node-js
http://elegantcode.com/2010/11/08/taking-baby-steps-with-node-js-introduction/
https://blog.heroku.com/archives/2014/3/11/node-habits
https://github.com/joyent/node/wiki/node-hosting

install npm:
$ curl http://npmjs.org/install.sh | sh

KSDG meet-up #1 from ericpi Bi

Yeoman - A Node.js cli tool for web developers from Caesar Chi

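[Note] npm install [-g | --global] <package>: with -g (long form --global) the package is installed globally instead of into the current project.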

Sunday, May 25, 2014

HDFS Architecture

http://hadoop.apache.org/docs/r1.0.4/hdfs_design.html

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on.
HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
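
[Note] A toy Python sketch of the two NameNode responsibilities described above: splitting a file into blocks and keeping the block-to-DataNode mapping. The 64 MB block size, the DataNode names, and the round-robin placement are assumptions for illustration; real HDFS placement is rack-aware and this is not the Hadoop API.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_blocks(num_blocks, datanodes, replication):
    """Toy NameNode: map each block index to `replication` distinct DataNodes.

    Round-robin placement is an assumption made for this sketch.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


MB = 1024 * 1024
blocks = split_into_blocks(150 * MB, 64 * MB)   # assumed 64 MB block size
block_map = place_blocks(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
print(blocks)
print(block_map)
```

A client that wants to read the file would ask the NameNode for this mapping and then fetch each block directly from one of the listed DataNodes.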

Friday, May 09, 2014

Building Hadoop Data Applications with Kite

http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop/
http://blog.cloudera.com/blog/2012/10/analyzing-twitter-data-with-hadoop-part-2-gathering-data-with-flume/
http://blog.cloudera.com/blog/2012/11/analyzing-twitter-data-with-hadoop-part-3-querying-semi-structured-data-with-hive/
http://blog.cloudera.com/blog/2013/03/how-to-analyze-twitter-data-with-hue/
https://github.com/cloudera/cdh-twitter-example

http://hortonworks.com/use-cases/sentiment-analysis-hadoop-example/
http://hortonworks.com/hadoop-tutorial/using-spring-xd-to-stream-tweets-to-hadoop-for-sentiment-analysis/

Building Hadoop Data Applications with Kite by Tom White from The Hive

Analyzing twitter data with hadoop from Open Analytics

Wednesday, May 07, 2014

Hadoop YARN: Next Generation Hadoop processing framework

http://wiki.apache.org/hadoop/PoweredByYarn/
http://wiki.apache.org/hadoop/NextGenMapReduce
http://dongxicheng.org/mapreduce-nextgen/apache-tez/
http://dongxicheng.org/mapreduce-nextgen/nextgen-mapreduce-introduction/
http://java.dzone.com/articles/next-generation-hadoop-its-not
https://speakerdeck.com/alexholmes/javaone-2013-presentation-next-generation-hadoop-its-not-just-batch
http://hortonworks.com/hadoop/tez/
http://hortonworks.com/blog/understanding-hadoop-2-0/
http://hortonworks.com/hadoop/yarn/

Hadoop 1.0 is based on the Hadoop 0.20.205 branch
Hadoop 2.0 is based on the Hadoop 0.23 branch

Sunday, April 20, 2014

MongoDB sharding architecture

http://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/
http://idning.github.io/nosqlrst.html
http://leehom59.blogspot.com/2011/11/mongo-db-sharding.html

Monday, April 14, 2014

Heroku: language (web framework) support

https://blog.heroku.com/archives/2011/8/3/polyglot_platform
https://devcenter.heroku.com/categories/language-support
http://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop

Ruby/Java/Python/Clojure/Scala/Node.js/Play

[Note] REPL: a read–eval–print loop is a simple, interactive computer programming environment.
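
[Note] A minimal Python sketch of the read, eval, print, loop cycle for a toy arithmetic language. The function names and the choice to support only numbers with + - * / are assumptions for illustration; input comes from a list instead of a terminal so the loop is easy to try out.

```python
import ast
import operator

# Binary operators supported by this toy arithmetic language.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node):
    """Evaluate a parsed arithmetic expression (numbers and + - * / only)."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")


def repl(lines):
    """Read each line, evaluate it, and collect what would be printed."""
    printed = []
    for line in lines:                                        # read
        try:
            result = evaluate(ast.parse(line, mode="eval"))   # eval
        except (SyntaxError, ValueError, ZeroDivisionError) as exc:
            result = "error: " + type(exc).__name__
        printed.append(str(result))                           # print
    return printed                                            # loop ends with the input


outputs = repl(["1 + 2", "2 * (3 + 4)", "1 / 0", "import os"])
print(outputs)
```

An interactive version would replace the list with input() calls and print each result immediately; errors are reported and the loop keeps going, which is what makes a REPL comfortable for experimenting.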

Saturday, April 05, 2014

Apache Accumulo

https://accumulo.apache.org/

The NSA's counterpart to HBase (Accumulo was originally developed at the NSA)

The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, Zookeeper, and Thrift. Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process. Other notable improvements and features are outlined on the project site.
Google published the design of BigTable in 2006. Several other open source projects have implemented aspects of this design including HBase, Hypertable, and Cassandra. Accumulo began its development in 2008 and joined the Apache community in 2011.
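
[Note] A toy Python sketch of what "sorted key/value store with cell-based access control" means. The class name, the (row, column) key shape, and the rule that a reader must hold every label on a cell are simplifying assumptions; Accumulo's real keys and visibility expressions are richer, and this is not its API.

```python
import bisect


class TinySortedStore:
    """Toy sorted key/value store with a visibility label set on every cell."""

    def __init__(self):
        self._keys = []      # sorted list of (row, column) keys
        self._cells = {}     # (row, column) -> (value, frozenset of labels)

    def put(self, row, column, value, labels=()):
        key = (row, column)
        if key not in self._cells:
            bisect.insort(self._keys, key)      # keep keys in sorted order
        self._cells[key] = (value, frozenset(labels))

    def scan(self, start_row, end_row, authorizations=()):
        """Yield (row, column, value) for visible cells with start_row <= row < end_row."""
        auths = set(authorizations)
        start = bisect.bisect_left(self._keys, (start_row,))
        for key in self._keys[start:]:
            if key[0] >= end_row:
                break
            value, labels = self._cells[key]
            if labels <= auths:                 # reader must hold every label
                yield key[0], key[1], value


store = TinySortedStore()
store.put("user2", "email", "bob@example.com", labels={"pii"})
store.put("user1", "name", "alice")
store.put("user1", "email", "alice@example.com", labels={"pii"})

public = list(store.scan("user1", "user3"))
private = list(store.scan("user1", "user3", authorizations={"pii"}))
print(public)
print(private)
```

The same scan returns different cells depending on the reader's authorizations, and results always come back in key order regardless of insertion order.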

HBase and Accumulo | Washington DC Hadoop User Group from Cloudera, Inc.

Thursday, March 27, 2014

Understanding BigData

http://en.wikipedia.org/wiki/Big_data
http://www-01.ibm.com/software/data/infosphere/hadoop/
http://www-01.ibm.com/software/data/infosphere/hadoop/pig/
http://www-01.ibm.com/software/data/infosphere/hadoop/hive/
http://www-304.ibm.com/industries/publicsector/fileserve?contentid=239170

"Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."

Understanding Big Data - IBM
Demystifying Big Data - IBM