Archive for the ‘Architecture’ Category

Articles

False assumptions on the minimum viable product

In Architecture,Business,Marketing,Product Development,Uncategorized on September 16, 2010 by petrem66

If you use Geoffrey Moore’s book ‘Crossing the Chasm’ as a general guideline on how to tap into a profitable market niche in the long run, the Lean Startup concept must be substantially refined. True, the aim is still to keep waste to a minimum during startup, but one has to admit that conducting market research with screenshots, mockups, or partially working prototypes is also a form of waste. Who is going to take you seriously and spend precious time with you talking about screenshots? Friends, colleagues, relatives: people who are not busy running real businesses. I’ve talked to people in my social network, and it is hard to convince them to even consider introducing me to really busy people. Do I have a case to push for that? No, not yet. I know that the path to the ‘chasm’ is a different one, though

I think that the absolute minimum needed to start playing the market-fit game is a functioning core product whose front end (that is, the UI) is easy to customize, so that market hypotheses can be tested as they come. Only after that is done can I afford to proceed with testing, learning, and realigning or extending, and, when possible, get paid for any service I provide to the customers my product manages to attract

What is the functioning core product?

Generally speaking, the core product consists of independent functionality: the minimum set of building blocks that any ‘solution’ in the problem domain will need. In practically all startups a common building block is the payment functionality. The building blocks must help lower the cost of putting together the minimum viable product, so that when testing a market hypothesis you can charge for the service should you find a real customer. Only by making revenue do you know for sure that your solution solves a real problem, and only then can you hope for solid market traction

If, after a long struggle to find customers and problems to solve, you have a customer but cannot charge for your service, that’s bad. A business is about selling goods and services, not volunteer work. Membership sign-up is also part of the core functionality

In my case, the document engine is also a core building block since my problem domain with www.documentclick.com is all about documents

What is not part of the core product?

Although some aspects of it are part of the core functionality, the user interface in general must not be counted as part of the core product. It will be refined over and over for better SEO, customer appeal, usability, etc. Adapters to third-party platforms (such as integration with salesforce.com, zoho.com, or Google Apps) may come later as you discover a market niche for them. SEO is not yet a concern.

Building generic frameworks to assemble building blocks is never a good idea. There are at least a few dozen excellent frameworks out there (including CMS/DMS) that can be ‘customized’ at minimum cost to include your building blocks and a specific UI for a minimum viable product

On securing the persistent data on ‘cloud’

In Architecture, Product Development on April 15, 2010 by petrem66

The application’s foundation has to be built so that it will withstand future heavy requirements such as security and privacy compliance. One of the most important aspects that needs to be well thought out is access to information from outside the running code.
In any cloud-based deployment, one can store blocks of data in a persistent storage service such as Cloud Files at Rackspace or S3 at Amazon Web Services. This is especially useful when such data must be shared among the many server instances that form your application. Three aspects of such blocks of data are worth discussing:
– encryption
– compaction
– integrity check

Encryption

A very detailed discussion of this topic can be found in Core Security Patterns. Suffice it to say that Java developers can choose from various encryption algorithms for this purpose. Since the block of data is not shared outside the application, one can go for a common encryption key. If that key is hard-coded somewhere in a reusable piece of code, it is reasonable to assume that a malicious third party is highly unlikely to get it from there. Cracking the key of a strong encryption algorithm like AES with a 192-bit key is quite a challenge, but even so one can imagine a strategy of changing the key on a regular basis (weekly or monthly, say), accompanied by a data migration task (that is, re-encryption)
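
The encryption and key-rotation idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name BlockCipherSketch and its methods are hypothetical, the choice of AES in GCM mode with a random IV prepended to the ciphertext is mine rather than something the application is known to use, and the key is generated at runtime here instead of being hard-coded.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper for illustration; not part of the original application.
final class BlockCipherSketch {
    private static final int IV_BYTES = 12;   // commonly recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a 192-bit AES key (assumption: created at runtime, not hard-coded).
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(192);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the block and prepends the random IV so decrypt() can recover it.
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + sealed.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(sealed, 0, out, iv.length, sealed.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the first bytes of the block, then decrypts the rest.
    static byte[] decrypt(SecretKey key, byte[] block) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, block, 0, IV_BYTES));
            return cipher.doFinal(block, IV_BYTES, block.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Key rotation: decrypt with the old key, re-encrypt with the new one.
    static byte[] reEncrypt(SecretKey oldKey, SecretKey newKey, byte[] block) {
        return encrypt(newKey, decrypt(oldKey, block));
    }
}
```

A periodic migration task would call reEncrypt on every stored block and then retire the old key.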

Compaction

In order to save on storage space and network bandwidth, one should consider compressing the blocks of data. The Java runtime comes with a neat package, java.util.zip, that wraps GZIP compression. The snippet of code below does the job of archiving/expanding a block of data:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Decompresses a GZIP-compressed block of data.
public byte[] expand(byte[] buffer) throws IOException {
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    try (GZIPInputStream zin = new GZIPInputStream(new ByteArrayInputStream(buffer))) {
        byte[] chunk = new byte[4096];
        int n;
        // read() returns -1 once the end of the compressed stream is reached
        while ((n = zin.read(chunk)) != -1)
            os.write(chunk, 0, n);
    }
    return os.toByteArray();
}

// Compresses a block of data with GZIP.
public byte[] archive(byte[] buffer) throws IOException {
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    // Closing the GZIP stream writes the trailer, so close it before reading the bytes.
    try (GZIPOutputStream zout = new GZIPOutputStream(os)) {
        zout.write(buffer);
    }
    return os.toByteArray();
}

Integrity check

From a consuming application’s perspective, it is important to be assured that the block of data has not been tampered with. Usually, the producing application accompanies it with an MD5-based checksum, which it can pass along with the block-of-data descriptor to the consumer. The md5sum can be obtained through the Java API using java.security.MessageDigest (see the code snippet below)

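import java.security.MessageDigest;

// Returns true when the block's MD5 checksum equals the expected one.
public boolean match(String md5sum, byte[] buffer) throws Exception {
    // Compare the whole checksum; a suffix match would also accept truncated values.
    return getMD5sum(buffer).equalsIgnoreCase(md5sum);
}

// Computes the MD5 checksum of the block as a 32-character lowercase hex string.
public String getMD5sum(byte[] buffer) throws Exception {
    byte[] sum = MessageDigest.getInstance("MD5").digest(buffer);
    StringBuilder sbuf = new StringBuilder();
    for (int i = 0; i < sum.length; i++) {
        int c = sum[i] & 0xff; // treat the signed byte as an unsigned value (0..255)
        sbuf.append(Integer.toHexString(c >>> 4));
        sbuf.append(Integer.toHexString(c & 15));
    }
    return sbuf.toString();
}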

The string that results from calling getMD5sum can be stored along with the block name and passed to the consuming application as such.
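
One way to carry the checksum together with the block name is a small descriptor object, sketched below. The BlockDescriptor class and its field names are hypothetical and only illustrate the hand-over between producer and consumer; the hashing itself uses the standard java.security.MessageDigest API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical descriptor passed from the producing to the consuming application.
final class BlockDescriptor {
    final String name;    // name of the stored block
    final String md5sum;  // lowercase hex MD5 of the block content

    BlockDescriptor(String name, String md5sum) {
        this.name = name;
        this.md5sum = md5sum;
    }

    // Producer side: build the descriptor from the block content.
    static BlockDescriptor describe(String name, byte[] content) {
        return new BlockDescriptor(name, md5Hex(content));
    }

    // Consumer side: recompute the checksum and compare it with the descriptor.
    boolean matches(byte[] content) {
        return md5sum.equals(md5Hex(content));
    }

    static String md5Hex(byte[] content) {
        try {
            byte[] sum = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(sum.length * 2);
            for (byte b : sum) {
                hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The producer would call BlockDescriptor.describe(name, content) before storing the block, and the consumer would call matches(content) after loading it.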

Using Rackspace Cloud Files – 1

In Architecture on April 11, 2010 by petrem66

Rackspace Cloud Files (CF) is an inexpensive way to persist private content that needs to be available over a longer period of time, but it is not practical for use as a shared file system for Rackspace cloud servers. The storage does not expose an internal IP, so whenever the application loads objects from it or saves objects to it, Rackspace will charge you network fees. In order to assess the benefits versus the shortcomings of using Rackspace CF in my design, I conducted a set of tests.

This post is the first in a series on experimenting with CF

Test premise

My web application is a virtual J2EE-based server consisting of a number of cloud-based façade servers and a grid of workers that do the actual processing on files. In between, I have a simple message queue to decouple the two groups of servers.

Assuming that an action triggered by the user causes an object O1 to be transformed into O2, and that the user expects to be presented with O2, the question is: what is the best way to employ Rackspace Cloud Files? The criteria for ‘best’ are cost, complexity, and performance (response time)

Possible architecture

I am considering two scenarios of usage of Rackspace CF by the components of the virtual server:

  • virtual file system, when both the application on the Web server and the Worker processing the request can access it to get and store objects (the content of the file)
  • lazy repository, when only the Worker can access it. The application on the Web server gets the content of the file through the Worker at the end of processing

Using Rackspace CF As Virtual File System

  1. User sends request
  2. Web application publishes request for processing
  3. Worker gets the request
  4. Worker gets the input file O1
  5. After processing, the Worker stores the result as O2 to CF
  6. Worker publishes work done with follow-up Id
  7. Web application gets the follow-up for processing completion
  8. Web application gets O2 from CF
  9. Reply to user
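
The nine steps above can be walked through with in-memory stand-ins, as in the rough sketch below. Everything in it is an assumption made for illustration: the VirtualFileSystemFlow class is hypothetical, a HashMap plays the role of the CF container, two plain queues play the role of the message queue, O1 is assumed to be read from CF, and the transformation of O1 into O2 is a placeholder that simply reverses the bytes. None of it is the Rackspace API.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical single-threaded simulation of the virtual file system scenario.
final class VirtualFileSystemFlow {
    private final Map<String, byte[]> cloudFiles = new HashMap<>();   // stand-in for CF
    private final Queue<String> requests = new ArrayDeque<>();        // web app -> worker
    private final Queue<String> completions = new ArrayDeque<>();     // worker -> web app

    // Puts an input object into the simulated store before the flow starts.
    void seed(String name, byte[] content) {
        cloudFiles.put(name, content);
    }

    // Steps 1-2: the web application publishes the name of the object to process.
    void publishRequest(String inputName) {
        requests.add(inputName);
    }

    // Steps 3-6: the worker takes the request, reads O1, stores O2, reports the follow-up id.
    void runWorkerOnce() {
        String inputName = requests.remove();
        byte[] o1 = cloudFiles.get(inputName);
        byte[] o2 = transform(o1);
        String followUpId = inputName + ".out";
        cloudFiles.put(followUpId, o2);
        completions.add(followUpId);
    }

    // Steps 7-9: the web application reads the follow-up id and fetches O2 for the reply.
    byte[] collectReply() {
        String followUpId = completions.remove();
        return cloudFiles.get(followUpId);
    }

    // Placeholder processing: reverses the bytes of the input.
    static byte[] transform(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[input.length - 1 - i];
        }
        return out;
    }
}
```

Because the worker and the web application only share the store and the queues, either side can be restarted and pick up from the last stored object, which is the transaction-like property of this scenario.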

Using Rackspace CF As a Lazy Repository

  1. User sends request
  2. The Web application publishes request for processing
  3. The Worker gets the request
  4. The Worker gets the input file O1
  5. The Worker processes O1 and keeps the result locally as O2
  6. The Web application gets the reply from the queue
  7. The Web application gets O2 from the worker and puts it in the response body
  8. The user receives O2
  9. The Worker stores O2 to CF

Both scenarios have strengths and weaknesses. When Rackspace CF is used as a virtual file system, the application can set up a transaction-like process to cover the case where the worker or the queue goes down before the response is propagated back. It is also a true grid architecture, in that the processors are completely isolated from the façade, but it is expensive in terms of reduced speed and extra cost. The communication between Rackspace servers and CF is regarded as external traffic. Rackspace charges for every GB of inbound and outbound traffic, so for every 1 MB of O2, it takes 3 x 1 MB of billed transfer to reach the user’s browser (see steps 5, 8, and 9). Also, if O2 needs to be transient, as in some business cases, storing it in CF is a waste. I need to test the speed penalty of external versus internal traffic for this scenario

When Rackspace CF is used as a lazy repository, the cost due to transporting O2 over the network is reduced to 2 x 1 MB (see steps 8 and 9). Also, in case O2 is transient, the application can discard it after use, saving the cost of storage on CF. However, such a configuration is not a true grid, and it would take a bit more work to ensure reprocessing in case a Worker is lost
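
The transfer counts of the two scenarios can be turned into a rough estimate with a few lines of arithmetic, sketched below. The TransferCost class is hypothetical, and the price per GB is a parameter the caller has to supply; no actual Rackspace price is assumed here, and using a single price for inbound and outbound traffic is a simplification.

```java
// Hypothetical helper comparing billed transfer volume for the two scenarios.
final class TransferCost {
    static final int VIRTUAL_FS_TRANSFERS = 3;       // O2 is moved three times per request
    static final int LAZY_REPOSITORY_TRANSFERS = 2;  // O2 is moved twice per request

    // Billed volume in MB for a number of requests producing objects of the given size.
    static double billedMegabytes(int transfersPerRequest, double objectMb, long requests) {
        return transfersPerRequest * objectMb * requests;
    }

    // Cost of a billed volume, assuming one flat price per GB (1 GB = 1024 MB).
    static double cost(double billedMb, double pricePerGb) {
        return billedMb / 1024.0 * pricePerGb;
    }
}
```

For 1,000 requests producing a 1 MB O2 each, billedMegabytes gives 3,000 MB for the virtual file system and 2,000 MB for the lazy repository, so the lazy repository saves one third of the billed transfer whatever the price per GB is.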