29.11.13

22.11.13

Task Management or Risk Management

Generally speaking, when we want to manage the work we are doing, we break it into tasks and manage those. This is the natural way to manage almost anything, and it is how software project management began.

However, it doesn't work well this way. Recently, one of my teams admitted that they couldn't continue one of their tasks with the approach they had chosen, and had to go another way. Some engineers didn't feel good about that, but I would say it is a good thing, except that we waited far too long before giving up. If we had admitted the failure earlier and changed course earlier, I would call it good practice.

But we didn't. The solution actually emerged a year ago and was finally put into the PO's scope about 5 months ago. In between, some of our engineers did some research, so when we began the task we felt quite confident. So we planned a lot of tasks around it and tried to make the story cover everything we wanted, including all the functional and non-functional requirements. We even spent a lot of time planning a monitoring solution for the target system.

With all that effort, and admittedly with some distraction from the original track, it still took us 5 months to conclude that this is not the way we want to go. Hey, 5 months? What took so long to find out that one approach fails? Even with the distraction, we should have reached this conclusion in 5 man-days, because this was a high-risk item, and we need to fail fast and move on before we invest too much time in it. Instead, we had the whole business unit waiting for the target system for such a long time, and now we have to change our plan.

Software development is a task, but a very special one. It is by nature very unpredictable and unrepeatable. In general, all tasks are at risk, but some are at higher risk and some at lower risk. We simply can't manage those tasks the way we manage others, like construction. For example, if we plan 3 tasks in a row but fail the second one, what can we do with the third? We can only re-plan everything and choose another approach, just as we did here. Sometimes we can only fail the whole project or target, admit the failure and move on.

It is very dangerous if our management team doesn't see this nature of software development and tries to build software in the traditional way. We need to apply a risk management strategy, which means we always expect tasks to fail and we do the riskiest work first to improve the success rate. We can't really estimate a task, even though we have to. When a task is at risk, we need to prepare another approach early, or even prepare for failure. We need to try everything in a simple way so that we don't add more risk to already risky tasks. When the solution is barely ready to ship, ship it, and yes, you have to, because there are other risks coming in. After we ship software that the customer can use, we can take more risks; that way we can try something risky and still get money back.
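
The risk-first ordering described above can be sketched in a few lines of Python. This is only an illustration, not a process we actually ran: the Task fields, the 0.5 threshold, the time boxes and the sample tasks are all made-up assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    risk: float                     # rough guess: 0.0 (safe) to 1.0 (likely to fail)
    timebox_days: int               # how long we try before we call it a failure
    fallback: Optional[str] = None  # alternative approach prepared in advance


def plan_by_risk(tasks: List[Task]) -> List[Task]:
    """Order the tasks so that the riskiest work is attempted first."""
    return sorted(tasks, key=lambda t: t.risk, reverse=True)


def missing_fallbacks(tasks: List[Task], threshold: float = 0.5) -> List[str]:
    """Names of high-risk tasks that have no alternative approach yet."""
    return [t.name for t in tasks if t.risk >= threshold and t.fallback is None]


# Hypothetical task list, for illustration only.
tasks = [
    Task("build monitoring", risk=0.2, timebox_days=10),
    Task("integrate target system", risk=0.8, timebox_days=5),
    Task("new storage approach", risk=0.6, timebox_days=5,
         fallback="keep current storage"),
]

for t in plan_by_risk(tasks):
    print(f"{t.name}: risk={t.risk}, timebox={t.timebox_days}d, fallback={t.fallback}")

print("needs a fallback:", missing_fallbacks(tasks))
```

The point is not the numbers, which are guesses anyway, but the order: the task most likely to fail gets tried first, inside a short time box, and every high-risk task without a prepared alternative is flagged before we start.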

I often hear our PO or engineers say that we need to have everything covered in the task list. Yes, we would like that, but we simply can't. We need to admit the unpredictable nature of software development and build the task list along with the development, as we build up our knowledge. We don't want to wait, because waiting is a cost. Before we can deliver useful software, we don't want to add unnecessary cost. We need to discuss and plan, but in a barely-good-enough way. For example, we can't call in everybody every time we need to discuss something; we need more individual interactions instead of global meetings. Some would ask, how can we keep everybody in sync? I would say, why should we keep everybody in sync? We need to decouple the system to the level where most engineers only need to focus on the part they are working on. We also want to simplify the architecture so that it is easy to understand and to track down the different parts of the system. Then we don't need to spend time on wide-spread meetings. If we need to transfer knowledge or share information, a few simple words can do the job. Also, we want to apply mature and popular solutions, architecture patterns and design patterns, so that we can describe the solution at a higher level while everybody knows exactly what happens underneath.

But first of all, we need to be fast, fail fast, and get the result fast enough. 

In conclusion, software development cannot be managed as a list of normal tasks, but only as a list of risks. We always need to prepare for failures and changes.

13.11.13

http://java.dzone.com/articles/three-motivational-forces
http://java.dzone.com/articles/why-we-shouldnt-use-more

12.11.13

Hadoop


  • http://stackoverflow.com/questions/19843032/good-tutorial-on-how-install-hadoop-2-2-0-yarn-as-single-node-cluster-on-macos
  • http://java.dzone.com/articles/introducing-spring-yarn
  • http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
  • http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
  • http://shaurong.blogspot.com/2013/11/hadoop-220-centos-64-x64.html
  • http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html

http://java.dzone.com/articles/real-time-search-and-anaytics

10.11.13

Projections in Vertica


  • http://www.vertica.com/2011/09/01/the-power-of-projections-part-1/
  • https://my.vertica.com/docs/5.1.6/HTML/index.htm#1299.htm
  • http://stackoverflow.com/questions/10211799/projection-in-vertica-database
  • Projections in Vertica | Baboon IT

Create Projections:

9.11.13

Done-done or always beta?

Some development teams advocate a done-done-with-high-quality delivery strategy, which means the engineers won't deliver their code to the main branch until the feature the code relates to has been tested thoroughly, including unit tests, integration tests, end-to-end tests and scale tests.

I know a team that has been using this strategy for three years and has released several versions of the software. The team is clearly not famous for good velocity. But they were quite confident in the quality until a customer crisis broke out suddenly. Many customers ran into big issues, all kinds of issues, so the team had to stop developing new features and focus on fixing the problems on the customer side.

Why does a team that has been advocating quality end up facing quality issues? There are many reasons, technical and non-technical. If we put the technical reasons aside for a while, I would blame the done-done strategy most.

The done-done strategy is based on the belief that quality can be achieved through enough testing. So if we sacrifice something like development time and spend more on testing, we can reach a certain level of quality.

This is not true at all. Software development is very dynamic and affected by many factors. "Building software is by its very nature unpredictable and unrepetitive." It is not possible to set up criteria for a certain feature, have it pass all of them and then call it done. We simply don't know the real criteria yet.

I am not saying we shouldn't test the system. What I am saying is: don't test it obsessively. Unfortunately, there is no clear boundary for non-obsessive testing. One thing we do know is that we should not ask the engineers to raise the bar to the so-called done-done-with-high-quality level. That way, you are asking for a promise. To make sure we keep the promise, we have to test the system obsessively until the managers give up and ask us to stop. That way, nobody can be blamed for quality. Then the morale of the team keeps going down, the velocity goes down, and because we don't have enough iterations or enough time for final verification before we deliver the software to the customers, the quality goes down.

There is a better way, called always beta. We emphasize the importance of velocity and call for dynamic development. We drive the development by risk instead of static quality. We review the software's functionality without requiring a perfect implementation. But we pay more attention to the architecture and code quality, to make sure that if we want to change or improve something, it is easy to do. Simplicity is one of the important features of the system. Since we don't expect a perfect system, it is more likely that we can implement it in the simplest way and avoid most of the complex logic.

Also, we have the team deliver code more frequently, and everyone in the team sees the system, feels the system, and runs the system. With more eyes, we can find more issues before they reach the customers. We always prepare for integration problems and fix them very quickly. We also take simple errors seriously if our engineers think they are important and quality-related. Yes, let the engineers tell us whether something is important or not. We keep refactoring, continuously improve the design and code quality, and keep the system as simple and as readable as possible.

With the always-beta mindset, we have more time to focus on the things we don't know upfront. And usually this is where the real quality issues emerge.

http://java.dzone.com/articles/code-made-me-cry
http://java.dzone.com/articles/what-if-java-collections-and

8.11.13

Deletion in Vertica



DELETE_VECTORS

Holds information on deleted rows to speed up the delete process.

Column Name        Data Type  Description
NODE_NAME          VARCHAR    The name of the node storing the deleted rows.
SCHEMA_NAME        VARCHAR    The name of the schema where the deleted rows are located.
PROJECTION_NAME    VARCHAR    The name of the projection where the deleted rows are located.
STORAGE_TYPE       VARCHAR    The type of storage containing the delete vector (WOS or ROS).
DV_OID             INTEGER    The unique numeric ID (OID) that identifies this delete vector.
STORAGE_OID        INTEGER    The unique numeric ID (OID) that identifies the storage container that holds the delete vector.
DELETED_ROW_COUNT  INTEGER    The number of rows deleted.
USED_BYTES         INTEGER    The number of bytes used to store the deletion.
START_EPOCH        INTEGER    The start epoch of the data in the delete vector.
END_EPOCH          INTEGER    The end epoch of the data in the delete vector.

PURGE

Purges all projections in the physical schema. Permanently removes deleted data from physical storage so that the disk space can be reused. You can purge historical data up to and including the epoch in which the Ancient History Mark is contained.

Syntax

PURGE()

Privileges

  • Table owner
  • USAGE privilege on schema

Note

  • PURGE() was formerly named PURGE_ALL_PROJECTIONS. HP Vertica supports both function calls.

Caution: PURGE could temporarily take up significant disk space while the data is being purged.

See Also

Purging Deleted Data in the Administrator's Guide

5.11.13