-
New Release. Better than the old one.
Posted on June 9th, 2009 No comments
Today marks the launch of a new version of MyDrifts. A better version. Much improved.
MyDrifts now includes GigFlyer, a powerful gig promotion feature designed to target and shoot your fans before or after they come to your show (we recommend after). Simply choose a gig you wish to promote, upload an image of your flyer and send. We’ll make sure that only the friends who are most likely to show up will receive it. And when they do, they’ll be able to listen to your tracks, get directions and RSVP in one click. You can track their responses too.
Fret not. We have considerable anti-spamming measures in place so that your campaigns will run smoothly, safely and securely.
-
Limitations of the New AWS Services Elastic Load Balancing, CloudWatch, and Auto Scaling
Posted on May 24th, 2009 No comments
What follows might sound very negative, pointing out only faults with Amazon’s offerings without giving them due credit for the positive features of the services. On the contrary – there is so much good stuff in these new services that it’s hard to find things to critique. The new services will save many businesses the hassle of “rolling their own” solutions.
With that said, here are some “gotchas” that lurk in Amazon’s new services:
Limitations of Elastic Load Balancing
ELB is a great first-release of a highly anticipated service, but it lacks some features needed by many existing applications.
- Integration with DNS is sub-optimal.
When an ELB appliance is commissioned it is given a DNS name by Amazon, to which all traffic should be directed. In order to direct traffic to the load balancer appliance you need to set up a CNAME DNS entry redirecting traffic for a subdomain to the Amazon-supplied DNS name. A CNAME is basically a “redirect” for a subdomain: I can route all traffic for “mail.example.com” to Gmail (and this is how the Google Apps for Domains service works) by setting up a CNAME entry redirecting “mail.example.com” to “ghs.google.com”. The problem is, using a CNAME requires the client to perform another DNS lookup to resolve it to an actual IP address. This is an unfortunate limitation: in general, applications that use load balancing do so in order to reduce the delay perceived at the client, and the extra DNS lookup causes the client to perceive a longer response time.
Another DNS integration weakness is also due to the need for a CNAME entry: a CNAME can be set only on a subdomain, not on the bare domain (example.com itself). If I am hosting my entire domain inside EC2, and I use multiple subdomains (which is encouraged, in order to allow for many kinds of optimizations) such as to separate images (images.example.com) from HTML (www.example.com) from downloadable content (download.example.com), and so forth, then I need to create a CNAME entry for each subdomain. This is OK for that simple example, but it becomes a management nightmare when your application uses more than a few subdomains that are known in advance, or when your application uses dynamically generated subdomains in its URLs.
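The extra resolution step can be illustrated with a toy resolver. This is a minimal sketch, not how a real DNS resolver works: the names, addresses, and the `resolve` helper are hypothetical, and it counts every record fetch as a separate lookup, whereas real resolvers and caches can sometimes return the CNAME target's address in the same response.

```python
# Toy DNS data: hypothetical names and documentation-range addresses.
RECORDS = {
    "direct.example.com": ("A", "192.0.2.10"),
    "www.example.com": ("CNAME", "my-elb.example-elb.invalid"),
    "my-elb.example-elb.invalid": ("A", "192.0.2.20"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until an A record is found.

    Returns (ip_address, lookups), where lookups counts record fetches.
    """
    lookups = 0
    for _ in range(max_hops):
        rtype, value = records[name]
        lookups += 1
        if rtype == "A":
            return value, lookups
        name = value  # CNAME: resolve the target name next
    raise RuntimeError("CNAME chain too long")


print(resolve("direct.example.com"))  # ('192.0.2.10', 1)
print(resolve("www.example.com"))     # ('192.0.2.20', 2)
```

In this model the name behind a CNAME costs one more fetch than the name with a direct A record, which is the added delay described above.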
Ideally ELB will be upgraded in the future to allow an Elastic IP address (an IP address leased from Amazon) to be connected to the ELB appliance. This would completely eliminate the need to use CNAMEs and avoid the two limitations described above.
- No integration with existing load balancing solutions.
ELB appliances can balance machines that are located within EC2 (in any availability zone within the same region), but not machines that are outside EC2. Enterprise users who want to “cloudburst” (temporarily expanding their capacity by using the cloud to handle demand exceeding their in-house capacity) need to instruct their in-house load balancers to forward traffic to the ELB appliance. But the ELB appliances themselves do not support any of the management APIs that load balancers use to measure the health of the machines they are balancing – they cannot report their health back to your in-house load balancer. If you add your ELB appliance to your in-house load balancer, you will need to monitor and manage the two load balancers (your “real” in-house load balancer and the EC2 “virtual” ELB appliance) using completely different tools and environments.
- No session affinity
Session affinity allows the same web browsing session to consistently be forwarded to the same back-end application server. It requires support by the load balancer (usually via cookies). Session affinity is important when you want to minimize the information that your application servers must share among themselves. Elastic Load Balancing does not support session affinity, so your application servers will need to share all of the session information among themselves. Memcached is particularly suited to this task – but requires dedicated memory for the purpose.
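To make the difference concrete, here is a minimal sketch of cookie-based session affinity next to plain round-robin. The class names, the `lb_backend` cookie name, and the `simulate` helper are all hypothetical and are not part of any Amazon API; the sketch only models which backend each request from one browser session reaches.

```python
import itertools


class RoundRobinBalancer:
    """Sends each request to the next backend, ignoring who sent it."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def route(self, cookies):
        return next(self._cycle), cookies


class StickyBalancer:
    """Pins a browser session to one backend using a cookie."""

    COOKIE = "lb_backend"  # hypothetical cookie name

    def __init__(self, backends):
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)

    def route(self, cookies):
        backend = cookies.get(self.COOKIE)
        if backend not in self._backends:
            # First request of the session (or stale cookie): pick a backend.
            backend = next(self._cycle)
        return backend, {**cookies, self.COOKIE: backend}


def simulate(balancer, requests=4):
    """Replay several requests from a single browser session."""
    cookies, seen = {}, []
    for _ in range(requests):
        backend, cookies = balancer.route(cookies)
        seen.append(backend)
    return seen


print(simulate(RoundRobinBalancer(["app1", "app2"])))  # ['app1', 'app2', 'app1', 'app2']
print(simulate(StickyBalancer(["app1", "app2"])))      # ['app1', 'app1', 'app1', 'app1']
```

Without affinity the same session lands on both servers, so both need access to its state; with affinity only one server ever sees it.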
Limitations of CloudWatch (monitoring)
CloudWatch currently has no web-based dashboard interface. Amazon promises this is coming soon, but it’s the most important feature of monitoring: the ability to *see* what’s going on, at a glance.
In addition, the metrics collected by CloudWatch are not always the ones that are important to your business. More on this below.
Limitations of Auto Scaling
- Fabric-level metrics are not necessarily the same language as your application’s SLA.
Auto Scaling works by monitoring metrics collected by CloudWatch, measurements of the behavior of EC2 instances and Elastic Load Balancers. These metrics measure “fabric-level” parameters: CPU utilization, network I/O, and disk I/O for instances, and for load balancers these parameters include number of requests per second and request/response latency. These fabric-level measurements are the digital equivalent of measuring blood pressure, heart rate, temperature, and testing neurological reflexes. These metrics, when they exceed a certain threshold, can indicate a problem – but they are not always the best way to characterize the problem, and these metrics do not provide enough information by themselves to determine the appropriate response. Often the metrics you want to observe are higher in the stack, parameters such as database transactions per second or message queue length. Such higher-level metrics are generally not expressible in terms of lower-level metrics.
The business purpose served by load balancing is the ability to meet a predefined SLA. In order to meet the SLA, the service level measurements must be observable by the scaling mechanism. But application-level service agreements are not expressed solely in terms of the fabric-level metrics of cpu utilization, network I/O, disk I/O, requests per second, and request/response latency. Application-level service agreements can be expressed in terms of database transactions, message queues, memory usage, index size, etc. Unless you can measure these application-level metrics in your scaling mechanism, you may not be able to design a scaling strategy to deliver your application’s SLA.
- Only a single, single-parameter scaling trigger can be assigned to each auto-scaling group.
To mitigate the discrepancy described above between the fabric-level metrics and the application’s SLA, you might sometimes be able to express higher-level service parameters in terms of the lower-level ones. For example, database transactions at the application level might be expressed as some combination of the lower-level parameters disk I/O and CPU utilization. Unfortunately an Auto Scaling group can only be equipped with a single trigger condition, which can only track a single metric, so you would not be able to compose a scaling trigger that tracked a combination of metrics. If your application SLA is not measurable at the fabric layer, your auto scaling trigger will not be properly expressible and you will end up either over-deploying (if your trigger is over-sensitive and engages more instances than necessary) or under-delivering (if your trigger is under-sensitive and does not scale up according to the proper conditions) on your SLA.
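The gap can be sketched in a few lines. This is an illustration only: the function names, metric names, weights, and thresholds are made up, and the code does not call or mirror the Auto Scaling API. It contrasts a trigger that compares one metric with one threshold against a hypothetical trigger that combines two metrics.

```python
def single_metric_trigger(sample, metric, threshold):
    """One metric compared with one threshold, as described above."""
    return sample[metric] > threshold


def composite_trigger(sample, weights, threshold):
    """A weighted combination of metrics (hypothetical; a single-metric
    trigger cannot express this)."""
    score = sum(weights[name] * sample[name] for name in weights)
    return score > threshold


# Made-up sample: CPU and disk I/O are each moderate, but taken together
# they might indicate that the database tier is near saturation.
sample = {"cpu_utilization": 65.0, "disk_io": 70.0}

print(single_metric_trigger(sample, "cpu_utilization", 80.0))  # False
print(single_metric_trigger(sample, "disk_io", 80.0))          # False
print(composite_trigger(sample, {"cpu_utilization": 0.5, "disk_io": 0.5}, 60.0))  # True
```

Neither single-metric trigger fires on this sample, while the combined score does, which is the under-sensitive case described above.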
-
MySpaceID
Posted on April 28th, 2009 No comments
MySpace is a goldmine for artists. It represents the greatest number of unique relationships between artists and music fans on the web and, for many artists, serves as the primary resource for music marketing and gig promotion.
To enhance the overall user experience and consistently add new features for its members, MySpace introduced a technology called MySpaceID which enables third-parties (like MyDrifts) to ‘link’ to its platform. MySpaceID allows MySpace users to use their MySpace identity on other networks. Utilizing MySpaceID, third-party networks can leverage the friend relationships on MySpace in their applications.
At MyDrifts, we develop tools to market music, promote gigs and manage relationships. Specifically, we develop applications to improve the quality of relationships between artists and fans on MySpace as well as other social networks. One of the ways we do this is by enhancing existing MySpace communication functionality such as messaging, bulletins, and notifications, to allow better targeting of messages sent to fans. We aim to extend our offering through MySpaceID in the future.
Learn more about MySpaceID on the MySpace developer site.
-
Responsible Use Measures
Posted on April 19th, 2009 2 comments
In order to minimize the potential for abuse of our service, we have developed a number of measures to encourage the safe, secure and responsible use of our website. In addition to making MyDrifts a spam-free zone, we hope that these measures will improve the quality of your relationships with your fans and success of your marketing campaigns. We consider these Responsible Use Measures a work in progress and will revise them from time to time in tandem with new feature releases and upgrades in our technology.
Message Automation
Group messages, bulletins, updates, friend requests, or other forms of communication intended for multiple recipients through MyDrifts (collectively referred to as “Campaigns”) may require the sender to confirm each recipient within the Campaign. For example, if you wish to send a Campaign to 200 recipients, you may be required to click ‘send’ two hundred times as opposed to just once. This measure is intended to meet the standards and Terms of Use of the social networks, including but not limited to Facebook and MySpace, whose communications features we rely on to distribute Campaigns.
Campaign Size
In some instances, we have placed a ‘cap’ on the number of recipients that may be included in a single Campaign. This cap, which represents the absolute maximum number of recipients the sender is allowed to communicate with as a group, may vary depending on the MyDrifts feature being used and the Terms of Use of the third-party network leveraged for the proposed Campaign.
Campaign Frequency
In some instances, we have limited the number of Campaigns that can be created and sent within a 24 hour period. This policy ensures that our users will use their sending privileges responsibly and create only meaningful and targeted promotional Campaigns.
Recipient Identity Privacy
We reserve the right not to reveal the identity of Campaign recipients to ensure that individual recipients are not ‘over-targeted’ by the sender, through our system or otherwise.
Automatic Targeting
MyDrifts identifies Campaign recipients automatically. In the event that the system identifies more than the maximum number of recipients for a single Campaign, it will automatically prioritize the individual recipients by their projected response rate and previous activity (if any).
Minimizing Over Targeting
MyDrifts will not contact the same recipient more than once within the applicable time frame, so that recipients are not harassed. Thus, your Campaigns will always be sent to the “best” matching recipients who have not yet been contacted within that time frame. For example, if you wish to send a Campaign to 200 recipients, and 100 candidate recipients already received a Campaign from you last week, the system will automatically skip those 100 recipients and select the next “best” 200 recipients based on their likelihood to respond positively to your Campaign.
-
Data and the Cloud
Posted on December 8th, 2008 No comments
The cloud is only as useful as the data that lives within.
Amazon’s recent announcement that they’re putting large datasets into the cloud for people to use reminded me of a throwaway comment once uttered by Sun’s Rich Zippel, VP of Technology in the CTO’s office. Around June 2006 he was discussing the cloud and its evolution. He said, almost as a side comment, that the next challenge in the adoption of the cloud is getting the data you want into it.
From today’s vantage point I can understand his comment better. Here are three angles on it.
The MyDrifts Angle: our data
From working on MyDrifts I understand that the data we have is the most critical business asset we possess. And, in order to run our product in the cloud we have to ensure that this data is safe in the cloud. That is why, when we moved to EC2 from another hosting provider, we invested in MySQL replication and periodic backup EBS snapshots. Thanks to Amazon’s low prices we could afford to run the extra CPU power and allocate the extra storage to ensure our data stays where we need it: in the cloud.
The AWS Angle: data-centric services
A high-level look at Amazon’s AWS portfolio reveals a clear indication that data is king in the cloud. Here is a (simplified) classification of AWS services available today:
- Storage: S3, SimpleDB, EBS
- I/O: SQS, CloudFront
- Computation: EC2
Of these six elements, three directly provide storage for your data in the cloud: S3, SimpleDB, and EBS. Plus, the two I/O-oriented services are really about ferrying data into, around, and out of the cloud, so these services can also be viewed as data-centric. In all, five out of six AWS service offerings are about data.
The Hadoop Angle: move the CPU, not the data
In the cloud, data is much more difficult to move than CPUs. This is why Hadoop’s MapReduce is implemented the way it is: computations are performed on the nodes that have the data locally.
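The idea of moving the computation to the data can be sketched with a toy scheduler. This is not Hadoop's actual scheduling code: the `schedule` function, the node and block names, and the greedy strategy are assumptions made for illustration. Each task goes to a node that already holds a replica of its block when one has capacity, and is sent elsewhere only as a fallback.

```python
def schedule(block_locations, free_slots):
    """Assign each data block's map task to a node.

    block_locations: block id -> list of nodes holding a replica.
    free_slots: node -> number of tasks it can still accept.
    Returns block id -> (node, is_local).
    """
    slots = dict(free_slots)
    plan = {}
    for block, holders in block_locations.items():
        local = [n for n in holders if slots.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # No replica holder has capacity: fall back to any free node,
            # which means the block must be copied over the network.
            # (Raises StopIteration if no node has a free slot at all.)
            node = next(n for n, free in slots.items() if free > 0)
            is_local = False
        slots[node] -= 1
        plan[block] = (node, is_local)
    return plan


blocks = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n3"]}
print(schedule(blocks, {"n1": 1, "n2": 1, "n3": 1}))
# {'b1': ('n1', True), 'b2': ('n2', False), 'b3': ('n3', True)}
```

Two of the three tasks run where their data already lives; only b2 has to be moved, because the single node holding it is already busy.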
Why Amazon’s move makes sense
Amazon’s move to host large data sets is ultimately going to make it more attractive for analysts to use EC2: the data they want is already there. I nominate the Netflix Prize data set as the next candidate to entice researchers to use EC2.
The more data that lives in the cloud, the more attractive the cloud becomes as a place to host your data-manipulation services.
